M. Willis Monroe

Visualizing Neo-Assyrian Scholars in Python

Just as I was writing this post an author on realpython.com released an excellent overview of using Matplotlib for Python plotting, a highly recommended read!

(Skip to the historical context below)

SAA 8 15 Cuneiform letter from Issar-šumu-ereš to the king (http://oracc.museum.upenn.edu/saao/saa08/P336444/html)

Introduction:

In the past couple of years I’ve been trying to meld my long history of casual Python programming with my scholarly work on scribes and knowledge in the cuneiform world. This has translated more recently into a couple of conference papers and a forthcoming book chapter. However, as most scholars can probably relate, much of the programming “research” happens at the very last minute: either in the closing days before an abstract deadline or in the weeks leading up to a conference. I decided to spend a little time thinking about how I gather and process data, so that the next time I’m crunched between a deadline and results I can lean on techniques I’ve already worked out.

To that end, I picked up two of the catalogs of texts from the open-access ORACC project State Archives of Assyria online. I chose the catalog of volume 8, as it’s a personal favorite of mine (Astrological Reports to Assyrian Kings by Herman Hunger), and that of volume 10 (Letters from Assyrian and Babylonian Scholars by Simo Parpola) because it also includes letters from scholars. These scholars were principally writing to the king to address questions and concerns he might have had about his own fate as well as that of the land and country. The scholars used a variety of methods to answer his queries, often referencing handbooks of important omens. We can use these catalogs to investigate who was writing to the king, and when, and try to place these patterns in their historical contexts.

The catalog file is spit out by the ORACC servers in a JSON format and it’s relatively easy to work with:

 
    {
      "type": "catalogue",
      "project": "saao/saa08",
      "source": "http://oracc.org/saao/saa08",
      "license": "This data is released under the CC0 license",
      "license-url": "https://creativecommons.org/publicdomain/zero/1.0/",
      "more-info": "http://oracc.org/doc/opendata/",
      "UTC-timestamp": "2017-06-21T23:31:15",
      "members": {
        "P236880": {
          "project": "saao/saa08",
          "ancient_author": "Ašaredu the Older",
          "astron_date": "a-668-03-16",
    ...
        "P236976": {
          "project": "saao/saa08",
          "ancient_author": "Nabu-šuma-iškun",
          "astron_date": "a-672-11-15",
    

Each text is contained within the “members” property, and identified by its “P-number”. From there we can access the various properties relevant to our analysis.
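
As a small illustration of that layout, the sketch below walks a hand-made dictionary shaped like the excerpt above. The name sample_catalog is a stand-in for the parsed file (an assumption for this example), and it holds only the two records and two fields visible in the excerpt.

```python
# A hand-made stand-in for the parsed catalog. The real file has many more
# members and many more fields per member.
sample_catalog = {
    "type": "catalogue",
    "project": "saao/saa08",
    "members": {
        "P236880": {
            "ancient_author": "Ašaredu the Older",
            "astron_date": "a-668-03-16",
        },
        "P236976": {
            "ancient_author": "Nabu-šuma-iškun",
            "astron_date": "a-672-11-15",
        },
    },
}

# Each key of "members" is a P-number; each value holds that text's properties.
summary = []
for p_number, properties in sample_catalog["members"].items():
    summary.append(
        (p_number, properties["ancient_author"], properties.get("astron_date"))
    )

for row in summary:
    print(row)
```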

Processing with Python:

The code below was all written in a Python Jupyter-notebook (incidentally I was trying out the new JupyterLab while writing this and quite liked it). These notebooks make it easy to iterate through a process of data analysis and visualization and describe the steps along the way. You can view a version of the code below here.

We begin with some standard boilerplate and imports. We need the graphs to appear inline so we can see them and change them if necessary. We also need to import a few Python libraries: json for working with the JSON file, Matplotlib for graphing, and finally two handy tools from the ever-useful collections module.

%matplotlib inline

import json
import matplotlib.pyplot as plt

from collections import Counter, OrderedDict

With the tools we’re going to use loaded, we can move on to actually getting our data out of the JSON files. This is just a quick Python three-liner per catalog; then we join both catalogs together:

filename = './saa_8_catalogue.json'
with open(filename) as f:
    catalog_data_saa_8 = json.load(f)
filename = './saa_10_catalogue.json'
with open(filename) as f:
    catalog_data_saa_10 = json.load(f)
all_texts = {**catalog_data_saa_10["members"], **catalog_data_saa_8["members"]}
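
The {**a, **b} form (available from Python 3.5) builds a new dictionary out of both sets of members. If the same key ever appeared in both catalogs, the value from the right-hand dictionary would win. Here is a toy sketch of that behavior; the P-numbers and author names are made up for the example:

```python
# Hypothetical members from two catalogs; "P000001" appears in both.
saa_10_members = {
    "P000001": {"ancient_author": "Author A"},
}
saa_8_members = {
    "P000002": {"ancient_author": "Author B"},
    "P000001": {"ancient_author": "Author A (SAA 8 record)"},
}

# Later unpackings overwrite earlier ones for duplicate keys.
merged = {**saa_10_members, **saa_8_members}

print(len(merged))
print(merged["P000001"]["ancient_author"])
```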

Now the catalog data is represented in our work environment by a handy Python dictionary. With that done, the first thing we’re going to attempt is figuring out how many texts each scholar wrote. We can do that by going through each text and keeping a running tally for each author we encounter. There are two ways to do this. The first uses a more verbose, old-fashioned for loop:

author_counts = {}
for _, item in all_texts.items():
    author = item["ancient_author"]
    if author in author_counts:
        author_counts[author] += 1
    else:
        author_counts[author] = 1

The second uses the handy Counter class from the collections module, together with a type of syntax called a list comprehension:

author_counts = Counter([text["ancient_author"] for text in all_texts.values()])

Whichever way we construct the counts, we need to make sure our end result is sorted by author. We do this with the other class imported from the collections module, OrderedDict. This class maintains the order of keys in a dictionary, which was not guaranteed by Python until recently (plain dictionaries preserve insertion order as of Python 3.7).

author_counts = OrderedDict(sorted(author_counts.items()))
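
To see that the two tallying approaches agree, and what the sorting step does, here is a minimal sketch on three made-up records (the P-numbers and the sample_texts name are placeholders). Note that sorted() compares strings by Unicode code point, so names with diacritics will not necessarily land where a linguistic alphabetical order would put them.

```python
from collections import Counter, OrderedDict

# Three hypothetical catalog entries, two by the same author.
sample_texts = {
    "P1": {"ancient_author": "Nabu-šuma-iškun"},
    "P2": {"ancient_author": "Ašaredu the Older"},
    "P3": {"ancient_author": "Nabu-šuma-iškun"},
}

# Approach 1: an explicit running tally.
via_loop = {}
for item in sample_texts.values():
    author = item["ancient_author"]
    via_loop[author] = via_loop.get(author, 0) + 1

# Approach 2: let Counter do the tallying.
via_counter = Counter(item["ancient_author"] for item in sample_texts.values())

# Sort by author name and keep that order.
sorted_counts = OrderedDict(sorted(via_counter.items()))

print(via_loop == dict(via_counter))
print(list(sorted_counts.items()))
```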

Before we start modifying the data it would also be nice to know if each author has a favored scholarly genre. We can do that by extracting text from the “subgenre” field for each text and assigning the values to each author and finding the most common value. This code is a bit complex in order to deal with the variety of data found in the catalogs. We also normalize it by assigning every text that comes from SAA 8 a value of “astrologers” since we know it’s coming from an astrologer. With this dictionary finished, we will be able to easily pass it an author and find out what their most common scholarly genre was:

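genres = {}
for text in all_texts.values():
    author = text.get("ancient_author")
    genre = "None"
    try:
        # Take the second word of the subgenre field as the genre label
        genre = text.get("subgenre").split()[1]
    except (AttributeError, IndexError):
        # subgenre missing or too short: keep the "None" placeholder
        pass
    # Every text from SAA 8 is treated as coming from an astrologer
    if text.get("volume") == "SAA 8":
        genre = "astrologers"
    if author in genres:
        genres[author].append(genre)
    else:
        genres[author] = [genre]
# Keep only the most common genre for each author
genres = {author: Counter(genre_list).most_common(1)[0][0] for author, genre_list in genres.items()}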
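
One detail worth spelling out is how to pick the most common value from a list of strings. A plain max() returns the alphabetically last string, not the most frequent one; Counter.most_common is one way to get the latter. The sketch below uses made-up authors and genre labels purely for illustration:

```python
from collections import Counter

# Hypothetical genre lists, one entry per text by that author.
genres_by_author = {
    "Author A": ["astrologers", "exorcists", "astrologers"],
    "Author B": ["None"],
}

# most_common(1) returns a list with one (value, count) pair.
favored_genre = {
    author: Counter(genre_list).most_common(1)[0][0]
    for author, genre_list in genres_by_author.items()
}

# For comparison: max() on strings picks the alphabetically last value.
alphabetical_last = max(genres_by_author["Author A"])

print(favored_genre)
print(alphabetical_last)
```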

Next we want to be able to filter our results. There are roughly 100 scholars ascribed authorship in the volume (including “unassigned” and joint-authored texts), but for the purposes of graphing the data we’re only interested in those who wrote more than fifteen texts. Here we define min and max parameters; setting them up first means we can change them easily later if we want to narrow the analysis by restricting the dataset. We then filter the existing dictionary of authors and counts by these parameters. I’ve included the for-loop version as well, but in this case I opted for a dictionary comprehension.

min_c, max_c = 0, max(author_counts.values())
min_c = 15
# The same filter written as a for loop:
# filtered_counts = {}
# for author, count in author_counts.items():
#     if min_c < count < max_c:
#         filtered_counts[author] = count
# Both bounds are strict, so the entry with the highest count is excluded too.
filtered_counts = {author:count for author, count in author_counts.items() if min_c < count < max_c}

With that done we’re actually ready to graph our first result. We’ve got a dictionary where each key is an author who wrote more than fifteen texts, and the value for each author in the dictionary is the number of texts they wrote.
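
To make the effect of the strict comparisons concrete, here is a small sketch with invented counts (none of these numbers come from the catalogs). An author with exactly fifteen texts is dropped, and so is whichever entry holds the maximum count:

```python
# Hypothetical tallies; the names and numbers are placeholders.
author_counts = {
    "Author A": 40,
    "Author B": 16,
    "Author C": 15,
    "Author D": 3,
    "unassigned": 120,
}

min_c, max_c = 15, max(author_counts.values())

# Strict inequalities on both sides: 15 < count < 120.
filtered = {
    author: count
    for author, count in author_counts.items()
    if min_c < count < max_c
}

print(filtered)
```

If the largest group should be kept, the upper comparison would need to be count <= max_c instead.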

So the next step is to use the Matplotlib library to create a horizontal bar graph of our data. This code can seem quite opaque, and part of the impetus behind this whole experiment was to try to understand graphing a bit better. I opted to include a bunch of extra code to make the graph look nicer. I’ve also been trying to figure out whether there’s a logical order to the code used to construct a graph with Matplotlib. In this case I first define the variables I need to represent my data in the graph, then configure general properties of the graphing environment. Next I make the changes specific to this graph, in particular its labels. Finally I call the actual graphing function, in this case barh, with the variables I defined at the beginning.

counts = filtered_counts
labels = ["{} ({})".format(author, genres[author]) for author in list(counts)[::-1]]
data = list(counts.values())[::-1]
# Attempt to make the plot look better:
plt.figure(figsize=(6, 7)) 
plt.style.use('fivethirtyeight')
plt.rcParams['font.family'] = 'serif'
plt.rcParams['font.size'] = 12

ax = plt.subplot(111) # remove borders
ax.spines["top"].set_visible(False)    
ax.spines["bottom"].set_visible(False)    
ax.spines["right"].set_visible(False)    
ax.spines["left"].set_visible(False) #
ax.grid(axis="x", color="black", alpha=0.3, linestyle="--") # grid lines
plt.tick_params(axis="both", which="both", bottom=False, top=False, # remove tick lines
                labelbottom=True, left=False, right=False, labelleft=True)
plt.title("Texts authored by Neo-Assyrian Scholars (n > 15)")
ax.set_xlabel("Number of texts in corpus")

# Plot the actual data:
plt.barh(range(len(counts)), data, tick_label=labels) 
plt.show()

What is quite clear from this graph is that astrology far outweighs any other type of scholarship in the court. This is slightly misleading as we explicitly sampled a volume only dedicated to astrology. But we also know that astrologers were some of the most important advisers to the king, and their interpretation of omens was considered to be a preeminent form of divination (Fincke, 2017, 392).

The great benefit of the scholarly reports to the king is that when they include astronomical observations we can sometimes date the observation itself. There are some caveats to this approach: first, we have to take the report itself as a true observation; second, a text could report an observation that occurred before (sometimes well before) the text was written. With all of this in mind, the next graph we’ll attempt to make from the data is a timeline of the same scholars seen above. To start with, the catalog includes two fields, “date” and “astron_date”; the latter generally looks like this: "a-668-03-16". We only really care about the year and some fraction thereof, so we roughly normalize the date:

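def get_year(text):
    # Expects three hyphen-separated integers: year, month, day
    if text is not None:
        try:
            year, month, day = map(int, text.split('-'))
            if year == 0:
                return 0
            # Rough normalization: twelve months of thirty days each
            date = year + month * 1/12 + day * 1/30 * 1/12
            return date
        except (ValueError, AttributeError):
            pass
    return 0
def get_astron_year(text):
    # Expects a value like "a-668-03-16"; the leading marker is skipped
    if text is not None:
        try:
            year, month, day = map(int, text.split('-')[1:])
            if year == 0:
                return 0
            date = year + month * 1/12 + day * 1/30 * 1/12
            return date
        except (ValueError, AttributeError):
            pass
    return 0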
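
As a worked example of the arithmetic, take the sample value "a-668-03-16": the year contributes 668, the month 3/12 = 0.25, and the day 16/360 ≈ 0.044, for roughly 668.294. The sketch below reproduces that calculation in isolation. The helper name fractional_year is made up for this example, and it assumes the four-part format shown in the catalog excerpt.

```python
def fractional_year(astron_date):
    """Convert a string like "a-668-03-16" into a rough fractional year.

    Assumes a leading marker followed by year, month and day, and a
    schematic year of twelve thirty-day months.
    """
    _, year, month, day = astron_date.split("-")
    return int(year) + int(month) / 12 + int(day) / 360


value = fractional_year("a-668-03-16")
print(round(value, 3))
```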

Because not every text can be dated, we need to be a bit more careful when we construct the data for this graph: indexing a text directly for its “astron_date” or “date” field raises an error if the field is missing, so we use .get(), which returns None instead.

author_years = {}
years = []
for _, item in all_texts.items():
    year = 0
    year = get_year(item.get("date"))
    if year == 0:
        year = get_astron_year(item.get("astron_date"))
    if year > 0:
        years.append(year)
        author = item["ancient_author"]
        if author in author_years:
            author_years[author].append(year)
        else:
            author_years[author] = [year]

Next we want to filter and sort our data again:

min_c, max_c = 0, max(author_counts.values())
min_c = 15
author_years = {author:years for author, years in author_years.items() if min_c < author_counts[author] < max_c}
author_years = OrderedDict(sorted(author_years.items()))

And, because we’re going to graph each scholar on a timeline for the entire period, we need to figure out the dates of the earliest and latest texts, and do the same for each scholar as well.

min_year, max_year, range_years = min(years), max(years), max(years) - min(years)
# author_years_active = {}
# for author, years in author_years.items():
#     author_years_active[author] = [max(years) - min(years), min(years), max(years)]
author_years_active = {author:[max(years) - min(years), min(years), max(years)] for author, years in author_years.items()}
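
Each entry in that dictionary is a three-item list: the span of activity, the earliest value, and the latest value. A short sketch with invented fractional years (not taken from the catalogs) shows the shape of the result, including the edge case of an author with only a single dated text:

```python
# Hypothetical fractional years for two authors.
author_years = {
    "Author A": [672.5, 669.25, 668.0],
    "Author B": [675.0],
}

# [span, smallest year value, largest year value] per author.
author_years_active = {
    author: [max(years) - min(years), min(years), max(years)]
    for author, years in author_years.items()
}

print(author_years_active["Author A"])
print(author_years_active["Author B"])
```

An author with one dated text ends up with a span of zero, so a bar drawn from that span alone would have no width.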

Finally, we’re ready to make our next graph. Following the convention above, I define my data first, configure the general look of the plot, make the changes specific to this graph, and then set the title and labels and plot the actual data. This graph also draws two vertical lines marking the beginning of Esarhaddon’s and Assurbanipal’s reigns:

ranges = [years[0] for author, years in author_years_active.items()][::-1]
starts = [years[1] for author, years in author_years_active.items()][::-1]
# Build labels from the same dictionary as the bars so the two always line up
labels = ["{} ({})".format(author, author_counts[author]) for author in author_years_active][::-1]
# Attempt to make the plot look better:
plt.figure(figsize=(6, 7)) 
plt.style.use('fivethirtyeight')
plt.rcParams['font.family'] = 'serif'
plt.rcParams['font.size'] = 12

ax = plt.subplot(111) # remove borders
ax.spines["top"].set_visible(False)    
ax.spines["bottom"].set_visible(False)    
ax.spines["right"].set_visible(False)    
ax.spines["left"].set_visible(False)
plt.tick_params(axis="both", which="both", bottom=False, top=False,    # remove ticklines
                labelbottom=True, left=False, right=False, labelleft=True)

ax.set_xlabel("Years BCE")
plt.title("Years active for Neo-Assyrian Astrologers")
plt.barh(range(len(author_years_active)), ranges, left=starts, tick_label=labels, alpha=0.75) # plot the actual data

xmin, xmax = plt.xlim() # reverse the x-axis
plt.xlim(710, xmin-5)
plt.ylim(-0.75, len(author_years_active))
ax.axvline(710, color="black", alpha=0.3, linewidth=2)
ax.axhline(-0.75, color="black", alpha=0.3, linewidth=1.5)
ax.vlines(range(705,640,-10), len(author_years_active), -0.75, color="black", alpha=0.3, linestyle="--", linewidth=1)
# Plot individual texts
for i, author in enumerate(list(author_years_active)[::-1]):
    ax.plot(author_years[author], [i]*len(author_years[author]),  'bo', alpha=0.5)

# Lines for Esarhaddon's and Assurbanipal's accessions to the throne
ax.axvline(680, color="red", alpha=0.3, linewidth=3)
ax.axvline(668, color="red", alpha=0.3, linewidth=3)

plt.show()

One of the benefits of this approach is that clear and distinct variable names can easily be re-used later for other forms of analysis. In the process of creating the two previous graphs we also happened to make everything we need to see the distribution of these texts over time for each author.

plt.figure(figsize=(10, 8)) 
plt.style.use('fivethirtyeight')
plt.rcParams['font.family'] = 'serif'
plt.rcParams['font.size'] = 12
plt.title("Distribution of texts from Astrologers")
ax = plt.subplot(111) # remove borders
ax.spines["top"].set_visible(False)    
ax.spines["bottom"].set_visible(False)    
ax.spines["right"].set_visible(False)    
ax.spines["left"].set_visible(False) 
plt.tick_params(axis="both", which="both", bottom=False, top=False,    # remove ticklines
                labelbottom=True, left=False, right=False, labelleft=True)
plt.violinplot(list(author_years.values()), showextrema=False)
ax.set_ylabel("Years BCE")
ax.set_xticks(range(1, len(author_years)+1))
ax.set_xticklabels(list(author_years), rotation=90)

# Lines for Esarhaddon's and Assurbanipal's accessions to the throne
line = ax.axhline(680, color="red", alpha=0.3, linewidth=3)
ax.axhline(668, color="red", alpha=0.3, linewidth=3)

plt.show()

Historical Context:

Finally, it’s worth pointing out that all of this data-centered processing can be used to say something about the history that we study. There’s very good evidence that Esarhaddon was rightly concerned about the accession of his chosen heir Assurbanipal (Frahm, 2017, 188-189). Letters and reports detail multiple uprisings and attempts to overthrow Esarhaddon’s rule, ending in a purge of treasonous high officials in 670 BCE. This period was also the scene of a highly ambitious attempt by Esarhaddon to get all officials of the empire to swear to respect the transition of power between father and son. Esarhaddon died in 669, and his son Assurbanipal succeeded him and became the last great king of the Neo-Assyrian empire. So whatever precautionary steps and measures Esarhaddon took, they seem to have worked.

With that as historical background, we can now see the graphs in light of a king’s concern with treason and uprisings and his worry about his heir-designate. In particular, the concentration of letters from scholars right around the height of Esarhaddon’s struggle to maintain power seems to indicate a preoccupation with figuring out what the stars and other omens could tell him. As an example, the leftmost name in the above graph is Adad-šumu-uṣur, Esarhaddon’s chief exorcist (Radner, 2017, 221). The preserved texts show a clustering of texts from him right before Esarhaddon’s death and Assurbanipal’s accession. It is likely that Esarhaddon was relying on his chief exorcist both to verify the veracity of reports and to double-check reported celestial omens. Obviously, a proper attempt at this analysis would want to look at the entire corpus of letters, including the large corpus of extispicy queries. However, this short overview of the evidence gives us a picture of which scholars were writing to the kings and when.

Acknowledgements:

This analysis wouldn’t be possible without the open-access CC-licensed data and framework from the ORACC project. Nor would the data exist without the work of Mikko Luukko, who digitized SAA 8 and 10 for the State Archives of Assyria online project.

Bibliography: