Source Scraping and Coding Confirmation


This continues our series of student reflections and analysis authored by our research team.


Emma Lovejoy

When we look for new cases to include in the tPP database, it’s always helpful to find an existing compilation we can pull names and data from.  While cases in some existing databases automatically meet the criteria for inclusion (lists specifically dedicated to terrorist acts, political extremism, etc.), for other sources each case must be individually investigated, and a determination made as to whether it should be included in tPP.  For sources like this, it’s important that we take the time to ensure we’re not adding extraneous cases, and that the information being added is up to date.

When we locate a case that we think may need to be included, the first step is to check the defendant’s name against each of our active spreadsheets, to ensure the case really is new to tPP.  Where we’ve already documented the case, this is an opportunity to see whether there are updates to be made: whether variables we’d had trouble with in the past are clarified by new source material, whether the trial has progressed, etc.  If the case does not appear in our data yet, we move on to source collection.
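The name check described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual tooling: the sheet names, the placeholder defendant names, and the helpers `normalize_name` and `find_existing` are all assumptions made for the example.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase a name, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in without_accents)
    return " ".join(cleaned.lower().split())


def find_existing(candidate: str, sheets: dict[str, list[str]]) -> list[str]:
    """Return the name of every sheet that already lists the candidate."""
    key = normalize_name(candidate)
    return [
        sheet
        for sheet, names in sheets.items()
        if key in {normalize_name(n) for n in names}
    ]


# Hypothetical sheets with placeholder names (not real tPP data).
sheets = {
    "completed": ["Jane Q. Example", "John Doe"],
    "in_progress": ["José Sample"],
    "excluded": ["Richard Roe"],
}

matches = find_existing("jose  sample", sheets)  # spelling variant of an existing entry
is_new_case = not find_existing("New Person", sheets)
```

Normalizing before comparing matters because the same defendant is often spelled slightly differently across datasets. An exact-match check like this one would still miss aliases and transliteration variants, so a human review step remains necessary.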

An easy place to start when a case’s inclusion is still questionable is looking at news stories.  They’re usually easier to find than court documents, and can give a general picture of the crime and the defendant.  Based on what we see in the news, we can usually make a judgement at least on whether or not the case meets the criteria for inclusion, even if the details of the defendant’s ideology remain unclear.  If it is a case for tPP, especially useful articles will be saved as source files, and known information (name, dates, location, etc.) as well as the dataset it was originally pulled from will be added to the working spreadsheet as a case-starter, to be coded.

If the case could have been included were it not for an exclusionary factor (charges not reaching felony-level, death prior to charging, etc.) then the basic case-starter information is filed separately as an excluded case, with sources and an explanation of why.  We do this to save ourselves time down the line, if the case should come up again in the future.  Given the overlapping content of various datasets, it’s not uncommon for an excluded case to raise flags on more than one occasion, so having this index improves our ability to work through external data efficiently.
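One way to picture the excluded-case index is as a small lookup table keyed by defendant name. The sketch below is an assumption about how such an index could be kept, not a description of the project's files: the `ExcludedCase` fields, the `record_exclusion` helper, and the file and dataset names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ExcludedCase:
    name: str
    reason: str                                           # why the case was excluded
    sources: list[str] = field(default_factory=list)      # saved source files
    flagged_by: list[str] = field(default_factory=list)   # datasets that surfaced it


def record_exclusion(index, name, reason, source, dataset):
    """Add a case to the index, or note that a known case was flagged again."""
    key = name.strip().lower()
    case = index.setdefault(key, ExcludedCase(name=name, reason=reason))
    if source not in case.sources:
        case.sources.append(source)
    if dataset not in case.flagged_by:
        case.flagged_by.append(dataset)
    return case


index: dict[str, ExcludedCase] = {}
record_exclusion(index, "John Doe", "charges below felony level", "article_1.pdf", "dataset_a")
# The same placeholder case surfaces again in a second external dataset.
record_exclusion(index, "john doe ", "charges below felony level", "article_2.pdf", "dataset_b")
```

Because the second call finds the existing entry, the earlier exclusion decision and its explanation are reused rather than investigated from scratch.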

Most recently, we have been working on developing a collective procedure for the scraping process.  That is, a system in which each stage of examination is assigned to an individual or group, to expedite the scraping of each document. So far, this assembly-line approach has yielded hundreds of new cases to be incorporated.

The Case of Keith Luke (Part 1 of 3: general overview & background)


Caitlin Marsengill

This journal is the first in a three-part series on the case of Keith Luke. The first journal is a general overview and background information on the case. The second journal will be on the prosecution and legal proceedings surrounding the case, and the final journal will be an analysis of the case. As a disclaimer, this case is particularly egregious: the crimes he committed involved sexual assault and were violent and racially motivated.

Keith Luke is a neo-Nazi and white supremacist from Brockton, Massachusetts. He went on a rampage in which he killed two people and raped and shot a third. All three victims were of Cape Verdean descent, and he sought them out because of their race. He premeditated the crimes for months. Beyond the racial motivation, he said he also committed the rape because he had been turned down “100,000 (expletive) times” and did not want to die a virgin. While he was sexually assaulting the woman, another woman came home and walked in on the act, so Luke decided to shoot her. He then shot the woman he had been raping, got in his car, cranked the music up, and left. As he was driving down the street he spotted his next victim, a 72-year-old homeless man pushing a carriage. Luke had bought a gun and over 200 rounds of ammunition, and he had planned to end his rampage by shooting and killing bingo players at a synagogue. He was attempting to reload his gun while driving but was having difficulty when the police caught up with him; he attempted to shoot at them and then crashed his vehicle. Luke later said that he regretted shooting at the police officers because they were white.

Many details describing the unusual behaviors Luke exhibited throughout his life became evident in his trial, from both testimony and actions Luke himself took. These will be discussed in more detail in the next journal entry in the series. Luke was ultimately convicted of killing two people and raping and shooting a third person.

One interesting aspect of coding this case is that he wanted to commit the rape so that he did not die a virgin, which initially raised some question as to whether the case should be included in the project. However, after reviewing more sources, it quickly became evident that the crime was motivated by his socio-political beliefs and that he had picked the victims because of their race and his hatred of minorities. It was also interesting how much the level of detail varied between articles. No single article gave the entire story, so we had to piece it together from multiple articles that each gave different details. As a result, it constantly felt like we were finding new bits of information.

These problems we ran into while coding the case speak to the difficulty of using documentary data sources, and to how cautious we have to be about the sources used in our research and their credibility, reliability, representativeness, and ethics (Caulfield and Hill, 2014). There is clear bias in the way these articles present the facts; however, that worked to our benefit in some of the articles, as they went out of their way to include details about his motive, socio-political motive, and ideological target. As Caulfield and Hill also discuss in their chapter, we can come to trust our sources when the underlying facts of the case are common to all of the articles and agree with other types of documents, such as the court records that we found.

 

References:

Caulfield, Laura, and Jane Hill. 2014. Criminological Research for Beginners: A Student’s Guide. 1st edition. Abingdon, Oxon: Routledge. [Excerpt: Chapter 10, “Using documentary and secondary data sources”]

It’s the Government’s Say (Part 2): On the Topic of ‘Other Status’


Emily Ashner

In the case of Abdirizak Haji Raghe Wehelie, a federal contractor who worked for the FBI as a linguist translating communications captured by court-authorized surveillance of a suspect in a terrorism investigation, the DOJ released his middle name as “Jaji,” likely by accident; for those who know Arabic, this has a very different meaning. While this is not a major issue, there are ways the government uses Muslim and Arab names and references that influence workers within the government and citizens.

The Institute for Social Policy and Understanding released statistics showing differences in the severity of punishment between Muslim and non-Muslim perpetrators who committed the same crime.1 These statistics show that Muslims receive a severe punishment 83% of the time, while non-Muslims do so only 17% of the time. Further, average prison sentences are four times higher if the perpetrator is perceived to be Muslim. This is why in the Prosecution Project we code for “other status.” The codebook assigns this status to defendants who meet any of the following: Does the defendant have a name not readily understood as European?; Is the defendant Muslim or a Muslim convert?; Is the defendant an immigrant from a non-Western/European country?

Clearly, the way government officials view perpetrators has an extreme effect on how they are sentenced. However, the way this information is presented also has incredible implicit bias effects on citizens, who are potential jury members. According to a study on implicit attitudes towards Arab-Muslims, participants showed an implicit negative attitude towards Arab-Muslims over both whites and blacks.2 Interestingly, prejudice could be moderated if the participants were exposed to positive values of Arab-Muslims first.

There is a wide understanding of the negative role the media can play, but government documents that immediately label a person’s race or religion also have clear effects on attitudes. Understanding the availability heuristic, the tendency to apply the group first thought of when a statement is seen or heard, can be important in consciously remaining unbiased. Once again, our process for the project is quite objective, so coding is not affected, but this is important to consider when thinking about how we can apply the results of our coding and other similar projects to mitigate the discriminatory nature of justice system sentencing.

Notes

  1. Rao, Kumar, Carey Shenkman, Khwaja Ahmed, Hasher Nisar, Dalia Mogahed, Sarrah Bugageila, Katherine Coplen, and Katie Grimes. “Equal Treatment? Measuring the Legal and Media Responses to Ideologically Motivated Violence in the United States.” Washington, DC: Institute for Social Policy & Understanding, April 2018.
  2. Park, Jaihyun, Karla Felix, and Grace Lee. 2007. “Implicit Attitudes towards Arab-Muslims and the Moderating Effects of Social Information.” Basic and Applied Social Psychology 29 (1): 35–45. doi:10.1080/01973530701330942.

Inter-Coder Reliability Between Projects


Stephanie Sorich

During the week of November 18, many members of the Project were given an assignment focused on “scraping” documents. Essentially, with scraping, coders search the names of defendants for potential cases in our project database to determine if we’ve already coded a case, or if we’ve found a new case to code. Coders set up what we called an “assembly line”: several picked out names to be searched, several checked our spreadsheets to see if those names were already coded, and the last few began the cases that weren’t yet coded. It was a very efficient way of getting new potential cases on the board, even if they can’t be finished immediately.

However, rather than taking potential cases from police reports or news articles, as is common in the Project, this assembly line focused on pulling potential cases from other research projects. The Threat Within provided us with a spreadsheet of cases compiled on foreign terror, as well as several other lists from Homeland Security and other organizations. Being able to compare cases with other projects and organizations working within the same field provided a tremendous opportunity to clarify the reliability of our work.

To a certain extent, we can check for the reliability of our results within our own project. The practice of checking for inter-coder reliability, or making sure that separate coders receive the same result when looking at a case, provides insight into whether coders are using the same standards and coding by the manual in the same way. More reliable coding makes it more likely that the values being coded are valid, as multiple coders are finding the same end results.
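To make the idea of inter-coder reliability concrete, here is a minimal sketch of two common agreement measures: raw percent agreement and Cohen's kappa, which corrects for agreement expected by chance. The post does not say which statistic the Project uses, so the choice of kappa is an assumption, and the coded values below are invented for illustration.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of cases on which two coders assigned the same value."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of cases")
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical values for one yes/no variable across six cases.
coder_a = ["yes", "yes", "no", "no", "yes", "no"]
coder_b = ["yes", "no", "no", "no", "yes", "no"]

agreement = percent_agreement(coder_a, coder_b)  # 5 of 6 cases match
kappa = cohens_kappa(coder_a, coder_b)           # (5/6 - 1/2) / (1 - 1/2) = 2/3
```

Kappa is lower than raw agreement here because, with only two possible values, the coders would agree on about half the cases by chance alone.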

However, coding correctly by the manual does not necessarily mean that the cases are being represented accurately. An issue of checking reliability and validity of the Project within the Project itself is the potential for groupthink. The inability to consult outside minds or consider other perspectives on coding outside those in the room with us each day could cause coders to accept variables and values as true rather than as things to be changed to better fit the project as it develops. Therefore, we took this scraping activity as an opportunity to check our results based upon cases from the lists provided that the Project had already coded for. In many cases, we found that our coding matched that of other projects or organizations, giving us sufficient evidence to believe that our methods of coding and the variables and values we are using are adequate.

In a sense, checking results between projects becomes its own method of analysis. While mostly used as a means to complete the research already being done, performing this comparison could be used as its own analysis between the results of tPP and other projects in order to get a grasp of the general ideas behind terrorism research. While not necessarily for publishing, it could prove to be a useful tool when further evaluating our coding process.

At the moment, it looks as if we are on the right track based upon the comparisons done between our data and others’. However, continuing to scrape new cases from other organizations and compare those done mutually between projects should and likely will be a priority of the Project. While we are working with a group of capable individuals who work carefully to produce the best results, breaking away from the group to get an outside perspective is the best way to awaken the parts of us that take our decisions as absolute with no room for suggestion.

For more information on groupthink as it pertains to team projects: https://pdfs.semanticscholar.org/be0c/9ebe64eb4c77673404f77f08cdc5600f97ef.pdf

My Year in Review


Emily Ashner

As 2019 wraps up, I approach my one-year anniversary with tPP. Looking back to when I asked to join the project at this point last year, I was at such a different knowledge level in terms of terrorism and extremism. I truly did not understand the meaning of these terms, the extent of domestic terrorism, and the variety of crimes that fall within our requirements on the project. After coding cases for two semesters, I have developed a strong comprehension of the terms of the court, how the prosecution process works, and the details of crimes that are so often discussed in the media. I feel so much more connected to current events and like a more active member in this realm.

Working on this project has not only provided me with a knowledge base specific to the project but has also given me many applicable skills. Big data is the new norm; efficient processing is being used across many domains. Understanding how this type of processing works, and having the opportunity to add to and adapt the system, have given me long-lasting skills. The room for creativity and input within this process has built a new approach to finding best practices and expanding data in different ways. I appreciate the underlying ways in which this project has strengthened skills that are applicable far beyond the prosecution field.

As a psychology major, I was unsure of how knowledge in this field would pave my future interests. In an unexpected manner, tPP created such a strong interest in the effects of bias. Working within judgment and decision making research I have understood how strong the influence of bias is. However, I did not understand application until I saw an intersection between this and the results of prosecution in the United States. Not only are judges, juries, and other members within the criminal proceedings driven by personal prejudice, whether conscious or unconscious, but these outcomes therefore influence the bias of society. There are endless cycles found in systemic discrepancy and the only way to break them is to first be conscious of their existence and then act in a manner that opposes this automatic process. Cycles are dangerous in creating continuous disadvantage. Whether the aim of the project was meant to discover this or not, the numbers bring light to the situation. There is such a strong ability of application that can arise from this project and I am excited to see where others may take this and where I can utilize a similar type of information moving forward in my career.

Discussing acts of terrorism brings such a strong emotional component. The large acts that are quickly associated with the term are events that affect so many people. These are events that can connect us, but it is important to realize these are also events that isolate those not responsible. I thank this project for this realization. I thank this project for giving me an accessible outlet to gain knowledge on the current state of this type of event and, more so, the ability to analyze these events objectively. tPP has provided me with so many skills I will carry well past this project, and I hope students of all majors will take advantage of such a unique project. Dr. Loadenthal and all members of the project are so dedicated and are creating something truly incredible. Thank you tPP, I will miss working on the project but have no doubt of the future success to come!

The Realities of Research Work: A semester reflection


Stephanie Sorich

When I pictured my first time participating on a research team, I pictured something grandiose: Beautiful Mind-styled webs of ideas pasted around on the walls, philosophical discussion about the social consequences of our results, and a glamorous story to tell my friends and family at home. I ended up with something different.

Grunt work. Lots and lots of grunt work.

At least, that’s how it felt originally. Coming onto this project, the first few days were mostly just filled with confusion over the newness of everything. Combine this with the feeling of combing through source files for the very first time and the tedious feeling of trying to navigate the team’s drive. After week one, I was kicking myself for not taking ice skating instead.

There’s a difference between the research that you conduct in class and the research that is conducted by projects like this one. In class, the data collection process is quick and usually painless, and after a few weeks you’ve got a ten page paper to prove you did the work. I loved the feeling of accomplishment, and after three years in school, I began to dream about the actual impact I could have one day.

But the issue with projects of this size is that the work doesn’t last a few weeks. Long after I have graduated, this project will finally see its database completed. The odds of me being there to see the fruits of all of our labor are very small. What I am here for is the day in and day out of the project: coding and recoding, searching the internet for any source I can find more credible than a random local newspaper, and talking out the nitty gritty details. As a big picture thinker, this was a huge change of pace for me.

I know that the job of coding the cases is obviously essential to the project. But especially as someone who expected to waltz into the room and become the star of the research world, there’s a little something to be desired. However, as a senior, I know that this won’t be the last time I feel this way. As I move into my first job (which is hopefully in research as well) I know that I’ll be starting at the bottom of the totem pole. There’s a good chance that I’ll be in a supporting role for a long time before I see myself becoming the professional that I want to be one day.

You’ve got to start at the bottom to make it to the top. So here are a few pieces of advice from someone who just finished their first semester on a research project:

  1. Get used to having 20 tabs open at once. Find a case, check it with the spreadsheet, check it with another spreadsheet, file the case files, have I enticed you yet?
  2. Know that the way you see things isn’t always going to be the right way to see things. In my case, it rarely was. Take advantage of the bright minds around you and learn how to look at things from different angles.
  3. Don’t take offense at someone pointing out your mistakes. Especially at the beginning, being the best you can be means taking criticism. When you give criticism, you don’t make it personal, so don’t take it personally when you’re on the other side.
  4. ASK QUESTIONS. Being the person who thinks they’ve got it all down when they don’t is far more embarrassing than having the humility to admit that you need help. All of your teammates will also be grateful that they don’t have to go back and fix your mistakes.
  5. Make yourself useful. There will be days where, in the course of your normal work, you find mistakes. Don’t be the one who turns a blind eye; be the one who goes in and fixes them.

Now I’m obviously not speaking as an expert. My one semester is basically a minute in comparison to the years that real professionals spend. But as the college equivalent of a grandfather in a rocking chair, I can tell you that starting at the bottom is okay and it will eventually help you as you move up. I think that this experience has made me a better teammate, a better learner, and an overall better worker. Research isn’t always the drama associated with changing the world; sometimes it really is just the drama of disagreeing with your coding partner.

Remote Coding: a Future Idea for the Prosecution Project


Meekael Hailu

Having been with the Prosecution Project for roughly five months, I feel I have developed a great feel for the overall workings of the project, although I have always wanted to challenge myself to learn more. As we round out the end of the semester, my mind begins to wonder how far this project will move into the future, and what exactly we can do as a coding operation to accelerate its development.

As I began to think about the possible steps of growth the project can take, I thought of coding expansion. With a proper technique and educational foundation in how to code in this project, I fundamentally believe that this project can expand past the institutional confines of wherever Dr. Loadenthal is teaching at a particular time. With a fully fleshed-out, selective application process, I feel that students at various institutions all across the country could join the project, as long as they meet the requirements.

This vision would require coding training, from start to finish, to be completely online. There could be some coding quizzes during the training process to verify that the instructions being given are being applied correctly. In addition, if possible, the individual in the HR position could hold weekly or bi-weekly video calls with the newly brought-on coders to check in with them and help them throughout the process of being trained in how to properly contribute to the database.

In addition, for this future network built remotely at various institutions across the country, there would be multiple assigned readings that would act as lecture replacements for people who cannot physically attend classroom-style instruction for the project. One example of a reading that could replace what we would learn in class comes from the book Empirical Political Analysis.

This book breaks down the basics that will be needed for the coding process, especially for beginner coders that might be brought onto this project in the future. It states:

“Whether the subject of the research is elections, political advertisements, news accounts, organizations, or anything else, you must be aware of the importance of the selection process and its implications for the meaning and usefulness of your research” (Rich, 2018).

This would be a brilliant piece of literature to require for the onboarding process. I feel that, with the proper preparation, this idea can be carried out with flying colors. This project produces more the more hands are involved in the process itself.

With this idea, obviously, there would need to be more stringent policies against misuse of the project’s information and database. This ties back into the necessity of meeting with the coders to develop some sort of accountability system for those who are not being actively monitored in a traditional classroom sense. One incentive that could be offered to the academically proficient students we would want as part of this project is academic credit at their respective institutions, if those institutions offer it.

Also, there could eventually be individuals at these various institutions who have created their own Prosecution Project chapter, and there could be a designed hierarchy and specific criteria that must be met by individuals filling those positions. For example, someone who has just joined would be classified as a beginner coder and be subject to the guidelines that we apply to beginners. New positions could also be developed in tandem with the remote coder system to ensure that the coding process runs seamlessly from beginning to end. One such position could be called a “data comber”: someone who combs through the data submitted to us remotely to make sure all the stringent criteria applied to it are met.

The longer I am involved with tPP, the more excited I become about where it seems to be going. I am happy I was able to be part of such an organization, and I know developments like these are only a certain amount of time away.

References

Rich, R. C. (2018). Empirical political analysis: quantitative and qualitative research methods. New York, NY: Routledge.

Reflections from a first-semester team member


Emma Lovejoy

I came on board with tPP in August 2019, and I think the project is amazing.  The issues we are investigating, the information we’re accessing, and the things our data can be used for in the future are all really exciting.  As a brand-new coder though, there have been challenges with developing confidence in my work.

The biggest hurdle has been cultivating any sense of ownership: I don’t have an academic background that’s very applicable, and picking up the mechanics of coding and information management didn’t come naturally to me.  The process of doing so didn’t leave a lot of time or space for getting a working understanding of the data we already have.

Even after months looking at our spreadsheets, I don’t know how clearly I could talk about the data as it looks today.  That disconnect, while minor in the abstract, in practice has caused my work to come slower than it could, since every decision I come across (from coding to excluding cases) feels like something I’m not qualified to do.  It’s imposter syndrome, in a word, but I think there are ways to help new members prevent and overcome it in years to come.

In our team’s closing discussions we debated a restructuring of the roles for future student participants as a possible first step.  Having smaller student groups with distinct goals and responsibilities, such as finding cases, updating existing cases, coding from scratch, and even document management with no data entry, would allow people to self-select roles that complement their skill sets.  I really think allowing specialization like this without first requiring a certain volume of coding would help students to approach the project with confidence, rather than having to ‘wait and see’ if they find something to do that they enjoy and are good at.  While there is tremendous value in having diverse skills and proficiencies, being able to gravitate toward things that are instinctively interesting might provide a more accessible starting place.

A consistent, perhaps self-paced training program has also come up in brainstorming for the future.  There’s definitely something to be said for everyone receiving the same orienting information when they come on board, and creating a digital program would allow more hands-on practice before actually touching a new case. I think it’s likely that extra practice at whatever task one’s focus will be on would greatly reduce unnecessary mistakes that come of second guessing, and to some degree increase productivity for the same reason: choices are faster and easier to make when you’re confident in your understanding.

I’m sure at this point someone has decided I’m just complaining, and that if I’d spent more time with the data or the manual or the codebook I wouldn’t have struggled with these things.  And that may well be true for people with a background in data science.  But, coming to it as a new skill to learn, and primed in a much more narrative style of research, it’s taken me longer to get accustomed to the simultaneous speed and specificity with which we need to address each case. I think the changes I’ve mentioned, as well as any number of ideas that have and will come up in the brainstorming process, will help to streamline the work of tPP as it continues to grow, and new people continue to come on board.

On the hardship of managing big data


Megan Roques

As the semester is coming to an end, it is essential to sit back and review the work that has been done. One of the tasks the class has taken on has been reviewing the validity of all of the project’s case files. The cases are separated by year and each student signed up to review one year’s worth of cases.

Before starting this assignment, I believed that the time spent on this task could be better used adding more cases to the database or scraping documents to find more cases. However, I was wrong. I did not take into account the hardship of managing big data.

It would be impossible for one person to go through all the folders in a timely manner. Additionally, keeping the source files up to date is a task that should be done often. A document that was accessible three years ago may not be available now because a company has shut down. This can happen with sources from non-governmental entities, such as an article from a local newspaper’s website. Once a document goes missing, it should be a priority to replace it with another one that contains the same (or even more!) information.

Most of the document replacement process was self-explanatory, such as checking to see if the case was actually in our records or adding more documents if there were not enough. One of the steps, however, piqued my interest: make sure there is an official document. This step was kind of shocking because I have previously coded a few cases where the only information available is from news sources or journal entries. The official information on a case may be difficult to obtain because the records may be sealed, the case may be a state case so the information is not easily available, or the case is very recent (within the past ~12 months) and processes like appeals or additional sentencing are still under way.
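The review steps mentioned above (is the case in our records, are there enough documents, is there an official document) can be pictured as a simple checklist function. This is a hedged sketch rather than the project's procedure: the `SourceFile` fields, the `audit_case` helper, the file names, and the `min_documents` threshold of 2 are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SourceFile:
    filename: str
    official: bool  # True for court records, indictments, and similar documents


def audit_case(case_name, files, recorded_cases, min_documents=2):
    """Return a list of problems found in one case folder (empty if none)."""
    problems = []
    if case_name not in recorded_cases:
        problems.append("case not in records")
    if len(files) < min_documents:
        problems.append("too few documents")
    if not any(f.official for f in files):
        problems.append("no official document")
    return problems


# Placeholder data: one case folder holding a single news article.
recorded_cases = {"Jane Q. Example", "John Doe"}
folder = [SourceFile("local_news_article.pdf", official=False)]

problems = audit_case("John Doe", folder, recorded_cases)
```

A folder that fails the official-document check is not necessarily wrong, since sealed or very recent cases may have no official record available yet; the flag simply marks the case as one to revisit.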

While going through my assigned folder, I was fascinated that, despite the new information released over time, our codes remain the same. Even in cases where official documents are added to our files, such as the Randy Linn case, the case is still “correct” based on the guidelines set forth by the project. The minor details that do get updated are things like a middle name or a case’s addition to a group case. You might be wondering why members don’t wait for the official documents to be released before fully coding a case. It may seem time-consuming and redundant to look for more files after a case is already in the database.

The information-gathering process is set up this way in order to deal with large amounts of data. This process is best described by scholars like Jensen (2018) as scraping information from other databases, governmental files (such as the FBI’s), and social media sites like Twitter. The scraped information then serves as our “supplementary data sources,” and the hunt for official documents begins. These supplementary data sources are key to coding a case because news sources or social media websites may give insight into the “why” of the offender, which is left out of indictments and charging documents. In that respect, the news articles and journal entries written about an offender can be more meaningful to the project than the official documents. However, gathering the official documents is still an important task, since they serve as confirmation of sentencing and charging details.

Throughout this activity, I have learned the value of different information sources and how each has its own impact on the project. It helped me realize that often not all the information one would like is available for a particular case. Therefore, it is important to monitor certain cases so that, when official documents or more details are released, we are able to retrieve them.

References:

Jensen, Michael J. “Big Data: Methods for Collection and Analysis.” In Theory and Methods in Political Science, edited by Vivien Lowndes, David Marsh, and Gerry Stoker, 4th ed., 306-20. London, UK: Palgrave Macmillan UK, 2018.

Growth Means Change



Morgan Demboski

As the semester comes to a close, the project is nearing its fourth year. Since its birth, the Prosecution Project has evolved and improved through a lot of trial and error and collaborative brainstorming. On our last day of class for the Fall 2019 semester, the team split into small groups, discussed problems we had found with the project, and came up with possible solutions to them. Being lucky enough to have been a part of this project since Fall 2017, I have witnessed how it has grown, expanded, and changed for the better. However, one problem we continuously run into is keeping the cases uniform as the codebook, manual, and variables are altered.

We run into new cases daily that make us reevaluate an aspect of our codebook. For example, until this semester, we had no ‘Threat/Harassment’ option under the ‘Tactic’ variable. We had the options ‘Bomb threat/Hoax’ and ‘Armed intimidation/standoff’; however, threatening a person or property for a social or political motive was not included among the tactics.

Furthermore, in the variable ‘Completion of Crime,’ we had the options ‘Planned but not Attempted,’ ‘Attempted,’ ‘Carried through,’ and ‘Unknown,’ but no option under which a threat fell. We were confronted with many cases that involved threats, such as William Patrick Syring from Arlington, VA, who threatened employees of the Arab American Institute (AAI), and Rick Lynn Simmons from Grand Rapids, MI, who left a racist, threatening voicemail for Democratic presidential candidate Senator Cory Booker. As a result, we added the option ‘Threat/Harassment’ to the ‘Tactic’ variable and ‘Threat’ to the ‘Completion of Crime’ variable.

This is one example of how our codebook and variables change constantly. However, when the codebook or something in the variables is altered, the cases that have already been completed are often left un-updated. Before we had the new variable, cases involving threats were always coded as ‘Other’ (recently changed to ‘Uncategorized’) in ‘Tactic,’ so when we added ‘Threat/Harassment’ as an option, these past cases needed to be changed. With the almost weekly modifications in coding language, variables, and formats, a lot of these older cases tend to slip through the cracks. To deal with this, my small group came up with an idea that we believe would help create more uniformity in coding past, present, and future cases. 
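One way to keep older cases from slipping through the cracks is to treat each codebook change as a small migration that is run over every completed case. The Python sketch below is only an illustration under stated assumptions: it represents a case as a plain dictionary of variable names to coded values, and the `RENAMES` and `NEEDS_REVIEW` tables are hypothetical structures, populated here with the ‘Tactic’ changes described above.

```python
# Hypothetical rename table: variable -> {old codebook label: new label}.
RENAMES = {"Tactic": {"Other": "Uncategorized"}}

# Values that a newly added option may have absorbed, so a coder should
# re-read the case. Threat cases were coded 'Other' (later 'Uncategorized')
# before 'Threat/Harassment' existed.
NEEDS_REVIEW = {"Tactic": {"Other", "Uncategorized"}}


def migrate_case(case):
    """Return (updated_case, variables_to_review) without mutating the input."""
    updated = dict(case)
    review = []
    for variable, value in case.items():
        # Flag the variable if its old value might now belong under a new option.
        if value in NEEDS_REVIEW.get(variable, set()):
            review.append(variable)
        # Apply a straight label rename where one is defined.
        updated[variable] = RENAMES.get(variable, {}).get(value, value)
    return updated, review
```

A rename such as ‘Other’ to ‘Uncategorized’ can be applied automatically, but deciding whether a case is really a ‘Threat/Harassment’ case still requires a person to read the sources, which is why the function only flags the variable for review rather than recoding it.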

Currently, the project is organized into a class of students, a handful of independent coders, and a small group of senior team members. A large majority of the class, along with the independent coders, works on adding cases and creating new case starters, coding and expanding on uncompleted cases, and updating pending or completed cases. We have assignments throughout the semester that help us keep on track: around 30 case starters per student and around 30 completely coded cases per student pair, both due by the end of the semester, plus other assignments such as scraping source files or swapping out source documents.

Though we have completed a lot of work this semester, our group suggests that it would be more efficient and effective to create different categories of coders, into which students are organized at the beginning of the semester. The first category is New Case Coders: coders who work solely on adding cases and case starters. They would focus on looking at Twitter and other social media posts, the news, and other sources to find cases that have not yet been included in our database. They would then create a case starter, simply including the Date, Date Descriptor, Case ID, Name, and a Short Narrative. This role is great for members who have little to no experience coding with the project, and for members who prefer researching and finding new cases to deciphering specific variables.

We named the second category of coders Expanding Coders. Members in this category would focus on expanding those case starters and filling in the necessary variables. Working from the source documents and basic information that the New Case Coder found, the Expanding Coder would finish the case and prepare it to be checked and finalized by a senior coder. The third category, which I believe to be the most important at this stage in the project, is Updating Coders. The members in this category would focus solely on updating existing cases with new information and with modifications to the codebook and manual. This would include updating pending cases with new information as well as updating finalized cases with newly found information or changes in variable language or codebook definitions.

The last category of team members, which already exists in the current structure, is the steering team, or senior members. This is a group of team members who have extensive experience with the project and a deep understanding of the coding process. These members manage the more logistical side of the project, along with Project Director Dr. Michael Loadenthal, but they also have the important task of checking finished cases for mistakes and discrepancies and moving them to the completed (For Official Use Only) spreadsheet.
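The handoff between the proposed roles can be pictured as a case moving through a few stages. The Python sketch below is an assumption-laden illustration rather than a description of the project's spreadsheets: the `Stage` names and the `NEXT_ROLE` mapping are hypothetical, while the five case-starter fields come from the proposal above.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    # Stage names are hypothetical labels for the steps described above.
    STARTER = "case starter"
    CODED = "coded, awaiting check"
    COMPLETED = "completed"


# Which proposed role picks up a case at each stage.
NEXT_ROLE = {
    Stage.STARTER: "Expanding Coder",
    Stage.CODED: "Senior coder",
    Stage.COMPLETED: "Updating Coder",
}


@dataclass
class CaseStarter:
    # The five fields a New Case Coder would fill in.
    date: str
    date_descriptor: str
    case_id: str
    name: str
    short_narrative: str
    stage: Stage = Stage.STARTER


def missing_fields(starter):
    """List the case-starter fields that are still blank."""
    required = ["date", "date_descriptor", "case_id", "name", "short_narrative"]
    return [field for field in required if not getattr(starter, field).strip()]
```

A New Case Coder could run `missing_fields` before handing a starter on, and `NEXT_ROLE[starter.stage]` names who would pick the case up next.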

After discussing our experiences with the project, each of my small group members had a preference for which category they would like to be a part of. We think that others in the project also prefer starting cases over updating them, and vice versa. By splitting people into their preferred categories, we would let them focus on a specific area rather than juggle both new cases and updates to old ones. The first week of class (or training) could be spent teaching students what each role does and giving a small assignment that introduces the different roles, so that students can decide on their preferred job. New members of the team would generally be New Case Coders so that they can become familiar with the structure of the project and the coding process. Senior coders are typically designated before the year starts, and the rest of the team would be welcome to choose whether they want to focus on adding new cases or updating old ones. By narrowing the duties of the members, each person can focus on perfecting their area and not worry about whether the other tasks need to be done.

This is not to say that the members would work solely on new cases or old cases; there are a lot of tasks in the project still needing to be completed that are unrelated to coding entirely. As the project expands and more cases are added, our team has to reorganize and find ways to create uniformity among the thousands of cases in our database. Our small group’s suggestion was one among many offered by the team, each containing novel aspects that will grow and improve the project. We have excelled so far at making modifications to meet the fast-paced growth of the database, and we will continue to come up with ways to meet the changing needs of the team and the project as a whole.