Source Scraping and Coding Confirmation


This continues our series of student reflections and analysis authored by our research team.


Emma Lovejoy

When we look for new cases to include in the tPP database, it’s always helpful to find an existing compilation we can pull names and data from. While cases in some existing databases automatically meet the criteria for inclusion (lists specifically dedicated to terrorist acts, political extremism, etc.), for other sources each case must be individually investigated, and a determination made as to whether it should be included in tPP. For sources like this, it’s important that we take the time to ensure we’re not adding extraneous cases, and that the information being added is up to date.

When we locate a case that we think may need to be included, the first step is to check the defendant’s name against each of our active spreadsheets, to ensure the case really is new to tPP. Where we’ve already documented the case, this is an opportunity to see if there are updates to be made: whether variables we’d had trouble with in the past are clarified by new source material, whether the trial has progressed, etc. If the case does not appear in our data yet, we move on to source collection.

An easy place to start when a case’s inclusion is still questionable is looking at news stories. They’re usually easier to find than court documents, and can give a general picture of the crime and the defendant. Based on what we see in the news, we can usually make a judgement at least on whether or not the case meets the criteria for inclusion, even if the details of the defendant’s ideology remain unclear. If it is a case for tPP, especially useful articles will be saved as source files, and known information (name, dates, location, etc.) as well as the dataset it was originally pulled from will be added to the working spreadsheet as a case-starter, to be coded.

If the case could have been included were it not for an exclusionary factor (charges not reaching felony level, death prior to charging, etc.), then the basic case-starter information is filed separately as an excluded case, with sources and an explanation of why. We do this to save ourselves time down the line, should the case come up again in the future. Given the overlapping content of various datasets, it’s not uncommon for an excluded case to raise flags on more than one occasion, so having this index improves our ability to work through external data efficiently.
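The triage steps described above can be sketched in code. This is only an illustration under stated assumptions, not tPP's actual tooling: the class name, field names, status strings, and the simple name normalization are all hypothetical, and the judgement about whether a case meets the inclusion criteria is still made by a human reader and passed in as a flag.

```python
from __future__ import annotations

from dataclasses import dataclass, field


def normalize(name: str) -> str:
    """Lowercase a name and collapse punctuation and spacing so variants match."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())


@dataclass
class Triage:
    """Hypothetical stand-in for the active spreadsheets and the excluded-case index."""

    coded_names: set[str]
    excluded: dict[str, str] = field(default_factory=dict)   # name -> reason
    starters: list[dict] = field(default_factory=list)       # new case-starters

    def __post_init__(self) -> None:
        self.coded_names = {normalize(n) for n in self.coded_names}

    def check(self, name: str, source_dataset: str, meets_criteria: bool,
              exclusion_reason: str | None = None) -> str:
        key = normalize(name)
        # Step 1: is the defendant already in our data?
        if key in self.coded_names:
            return "already coded: review for updates"
        # Step 2: have we already looked at this case and excluded it?
        if key in self.excluded:
            return f"previously excluded: {self.excluded[key]}"
        # Step 3: the human reviewer's call, based on news stories.
        if not meets_criteria:
            return "not a tPP case"
        # Step 4: would qualify, but an exclusionary factor applies.
        if exclusion_reason:
            self.excluded[key] = exclusion_reason
            return "filed as excluded"
        # Step 5: record a case-starter, noting the dataset it came from.
        self.starters.append({"name": name, "source_dataset": source_dataset})
        self.coded_names.add(key)
        return "case-starter added"
```

Keeping the excluded-case index separate from the coded names is what lets a name that resurfaces in a second dataset be dismissed immediately, with the original reason attached.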

Most recently, we have been working on developing a collective procedure for the scraping process.  That is, a system in which each stage of examination is assigned to an individual or group, to expedite the scraping of each document. So far, this assembly-line approach has yielded hundreds of new cases to be incorporated.

New Spring 2020 tPP Syllabus Available!

As the Prosecution Project has grown and changed, it has benefitted from utilizing the classroom. We have used the shared space as a laboratory, workshop, assembly line, and debate stage for developing our processes and concepts. Nearly every semester we have been able to host a twice-weekly class to get students in the same space working on tPP.

We are happy to share the Spring 2020 syllabus, which focuses on advanced qualitative coding.

To view Dr. Loadenthal’s past tPP syllabi focusing on advanced secondary research, project management and design, and analysis see:

 

The Case of Keith Luke (Part 1 of 3: general overview & background)


This continues our series of student reflections and analysis authored by our research team.


Caitlin Marsengill

This journal is the first in a three-part series on the case of Keith Luke. The first journal gives a general overview and background information on the case. The second journal will cover the prosecution and legal proceedings surrounding the case, and the final journal will be an analysis of the case. As a disclaimer, this case is particularly egregious: the crimes he committed involved sexual assault and were violent and racially motivated.

Keith Luke is a neo-Nazi and white supremacist from Brockton, Massachusetts. He went on a rampage in which he killed two people and raped and shot a third. All three victims were of Cape Verdean descent, and he sought them out because of their race. He premeditated the crimes for months. Beyond the racial motivation, he said he also committed the rape because he had been turned down “100,000 (expletive) times” and did not want to die a virgin. While he was sexually assaulting the woman, another woman came home and walked in on the act, so Luke decided to shoot her. He then shot the woman he had been raping, got in his car, cranked the music up, and left. As he was driving down the street he spotted his next victim, a 72-year-old homeless man pushing a carriage. Luke had bought a gun and over 200 rounds of ammunition, and he had planned to end his rampage by shooting and killing bingo players at a synagogue. He was attempting to reload his gun while driving, but was having difficulty when the police caught up with him; he attempted to shoot at them and then crashed his vehicle. Luke later said that he regretted shooting at the police officers because they were white.

Many details of the unusual behaviors Luke exhibited throughout his life became evident at his trial, both from testimony and from actions Luke himself took. These will be discussed in more detail in the next journal entry in the series. Luke was ultimately convicted of killing two people and raping and shooting a third person.

One interesting aspect of coding this case is that he said he committed the rape so that he would not die a virgin, which initially raised some question as to whether the case should be included in the project. However, after reviewing more sources, it quickly became evident that the crime was motivated by his socio-political beliefs and that he had picked the victims due to their race and his hatred towards minorities. It was also interesting how much the level of detail varied between articles. No single article gave the entire story, so we had to piece it together from multiple articles that each gave different details. Because of this, it constantly felt like we were finding new bits of information.

The problems we ran into while coding the case speak to the difficulty of using documentary data sources and to how cautious we have to be about the sources used in our research and their credibility, reliability, representativeness, and ethics (Caulfield and Hill, 2014). There is clear bias in the way these articles present the facts; however, that worked to our benefit in some of the articles, as they went out of their way to include details about his socio-political motive and his ideological target. As Caulfield and Hill also discuss in their chapter, we can come to trust our sources when the underlying facts of the case are common to all of the articles and agree with the other types of documents we found, such as court records.

 

References:

Caulfield, Laura, and Jane Hill. 2014. Criminological Research for Beginners: A Student’s Guide. 1st edition. Abingdon, Oxon: Routledge. [Excerpt: Chapter 10, “Using documentary and secondary data sources”]

It’s the Government’s Say (Part 2): On the Topic of ‘Other Status’


This continues our series of student reflections and analysis authored by our research team.


Emily Ashner

While coding the case of Abdirizak Haji Raghe Wehelie, a federal contractor who worked for the FBI as a linguist translating communications captured by court-authorized surveillance of a suspect in a terrorism investigation, we noticed that the DOJ had released his middle name as “Jaji,” likely by accident; for those who know Arabic, this has a very different meaning. While this is not a major issue, the ways the government uses Muslim and Arab names and references influence both workers within the government and citizens.

The Institute for Social Policy and Understanding released statistics showing the difference in severity of outcomes between Muslim and non-Muslim perpetrators who committed the same crime.1 These statistics show that Muslims receive a severe punishment 83% of the time, while non-Muslims do so only 17% of the time. Further, average prison sentences are four times higher if the perpetrator is perceived to be Muslim. This is why in the Prosecution Project we code for “other status.” The codebook assigns this status if the defendant meets any of the following: Does the defendant have a name not readily understood as European?; Is the defendant Muslim or a Muslim convert?; Is the defendant an immigrant from a non-Western/European country?

Clearly, the way government officials view perpetrators has an extreme effect on how they are sentenced. However, the way this information is presented also creates considerable implicit bias among citizens, who are potential jury members. According to a study on implicit attitudes towards Arab-Muslims, participants showed an implicit negative attitude towards Arab-Muslims relative to both whites and blacks.2 Interestingly, prejudice could be moderated if the participants were first exposed to positive information about Arab-Muslims.

The negative role the media can play is widely understood, but government documents that immediately label a person’s race or religion also have clear effects on attitudes. Understanding the availability heuristic, the tendency to apply the group first thought of when a statement is seen or heard, can be important in consciously working to remain unbiased. Once again, our process for the project is quite objective, so coding is not affected, but it is important to consider when thinking about how we can apply the results of our coding and of similar projects to mitigate the discriminatory nature of justice system sentencing.

Notes

  1. Rao, Kumar, Carey Shenkman, Khwaja Ahmed, Hasher Nisar, Dalia Mogahed, Sarrah Bugageila, Katherine Coplen, and Katie Grimes. “Equal Treatment? Measuring the Legal and Media Responses to Ideologically Motivated Violence in the United States.” Washington, DC: Institute for Social Policy & Understanding, April 2018.
  2. Park, Jaihyun, Karla Felix, and Grace Lee. 2007. “Implicit Attitudes towards Arab-Muslims and the Moderating Effects of Social Information.” Basic and Applied Social Psychology 29 (1): 35–45. doi:10.1080/01973530701330942.

Preparing a report for the US Army War College

In December 2019, tPP was contacted by an individual with the United States Army War College seeking assistance with data. This individual asked if we could provide a number of measures of criminal defendants in order to compare defendants with and without military histories.

After a few emails and a phone call to ensure we could deliver the desired analysis product, our team went to work and in a few days completed two reports: “Active duty vs. discharged veterans and international vs. domestic affiliation in the Prosecution Project (tPP) dataset” and “Veteran versus civilian comparisons in the Prosecution Project (tPP) dataset.” We are pleased to share these reports here for others to review.

The Prosecution Project team is happy to assist individuals and institutions when our data can be useful. If you have a query which could benefit from tPP data, please let us know!

Inter-Coder Reliability Between Projects


This continues our series of student reflections and analysis authored by our research team.


Stephanie Sorich

During the week of November 18, many members of the Project were given an assignment focused on “scraping” documents. Essentially, in scraping, coders search our project database for the names of defendants in potential cases to determine whether we’ve already coded a case or have found a new case to code. Coders set up what we called an “assembly line”: several picked out names to be searched, several checked our spreadsheets to see if those names were already coded, and the last few began the cases that weren’t yet coded. It was a very efficient way of getting new potential cases on the board, even if they can’t be finished immediately.

However, rather than taking potential cases from police reports or news articles, as is common in the Project, this assembly line focused on pulling potential cases from other research projects. The Threat Within provided us with a spreadsheet of cases compiled on foreign terror, and we also had several other lists from Homeland Security and other organizations. Being able to compare cases with other projects and organizations working within the same field provided a tremendous opportunity to clarify the reliability of our work.

To a certain extent, we can check for the reliability of our results within our own project. The practice of checking for inter-coder reliability, or making sure that separate coders reach the same result when looking at a case, provides insight into whether coders are using the same standards and coding by the manual in the same way. More reliable coding makes it more likely that the values being coded are valid, as multiple coders are finding the same end results.
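As a rough illustration of what such a check can look like, the sketch below computes two common agreement measures for a pair of coders on one variable: simple percent agreement, and Cohen's kappa, which discounts the agreement expected by chance. The post does not say which statistic the Project uses, so the choice of measures, the function names, and the sample values are all assumptions made for the example.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of cases on which the two coders entered the same value."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of cases")
    return sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)


def cohens_kappa(coder_a, coder_b):
    """Agreement between two coders, corrected for chance agreement."""
    n = len(coder_a)
    observed = percent_agreement(coder_a, coder_b)
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    # Chance agreement: probability both coders pick the same value at random,
    # given how often each coder used each value.
    expected = sum(counts_a[v] * counts_b[v] for v in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both coders used a single identical value throughout
    return (observed - expected) / (1 - expected)


# Hypothetical values for one variable across four cases.
coder_a = ["yes", "yes", "no", "no"]
coder_b = ["yes", "no", "no", "no"]
agreement = percent_agreement(coder_a, coder_b)
kappa = cohens_kappa(coder_a, coder_b)
```

Here the coders agree on three of four cases, but because half of that agreement would be expected by chance, kappa comes out lower than the raw agreement rate.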

However, coding correctly by the manual does not necessarily mean that the cases are being represented accurately. An issue of checking reliability and validity of the Project within the Project itself is the potential for groupthink. The inability to consult outside minds or consider other perspectives on coding outside those in the room with us each day could cause coders to accept variables and values as true rather than as things to be changed to better fit the project as it develops. Therefore, we took this scraping activity as an opportunity to check our results based upon cases from the lists provided that the Project had already coded for. In many cases, we found that our coding matched that of other projects or organizations, giving us sufficient evidence to believe that our methods of coding and the variables and values we are using are adequate.
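A cross-project check of this kind could be sketched as follows, assuming each project's coding is available as a mapping from a case identifier to its coded variables. The identifiers, variable names, and values here are hypothetical, and real comparisons would first need the two projects' variables mapped onto a shared vocabulary.

```python
def compare_projects(ours, theirs):
    """Compare coded values for cases that appear in both projects.

    Returns per-variable match rates and a list of disagreements to review.
    """
    shared_cases = sorted(set(ours) & set(theirs))
    totals, matches, disagreements = {}, {}, []
    for case in shared_cases:
        # Only variables coded by both projects can be compared.
        for var in sorted(set(ours[case]) & set(theirs[case])):
            totals[var] = totals.get(var, 0) + 1
            if ours[case][var] == theirs[case][var]:
                matches[var] = matches.get(var, 0) + 1
            else:
                disagreements.append((case, var, ours[case][var], theirs[case][var]))
    rates = {var: matches.get(var, 0) / totals[var] for var in totals}
    return rates, disagreements


# Hypothetical data: two shared cases, one case unique to each project.
ours = {
    "case-001": {"ideology": "A", "plea": "guilty"},
    "case-002": {"ideology": "B", "plea": "not guilty"},
    "case-003": {"ideology": "A", "plea": "guilty"},
}
theirs = {
    "case-001": {"ideology": "A", "plea": "guilty"},
    "case-002": {"ideology": "C", "plea": "not guilty"},
    "case-999": {"ideology": "B", "plea": "guilty"},
}
rates, disagreements = compare_projects(ours, theirs)
```

The list of disagreements is the more useful output for guarding against groupthink, since each entry is a concrete case where an outside project read the same facts differently.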

In a sense, checking results between projects becomes its own method of analysis. While mostly used as a means to complete the research already being done, performing this comparison could be used as its own analysis between the results of tPP and other projects in order to get a grasp of the general ideas behind terrorism research. While not necessarily for publishing, it could prove to be a useful tool when further evaluating our coding process.

At the moment, it looks as if we are on the right track based upon the comparisons done between our data and others’. However, continuing to scrape new cases from other organizations and compare those done mutually between projects should and likely will be a priority of the Project. While we are working with a group of capable individuals who work carefully to produce the best results, breaking away from the group to get an outside perspective is the best way to awaken the parts of us that take our decisions as absolute with no room for suggestion.

For more information on groupthink as it pertains to team projects: https://pdfs.semanticscholar.org/be0c/9ebe64eb4c77673404f77f08cdc5600f97ef.pdf

My Year in Review


This continues our series of student reflections and analysis authored by our research team.


Emily Ashner

As 2019 wraps up, I approach my one-year anniversary with tPP. Looking back to when I asked to join the project at this point last year, I was at such a different knowledge level in terms of terrorism and extremism. I truly did not understand the meaning of these terms, the extent of domestic terrorism, or the variety of crimes that fall within our requirements on the project. After coding cases for two semesters, I have developed a strong comprehension of court terminology, how the prosecution process works, and the details of crimes that are so often discussed in the media. I feel so much more connected to current events and like a more active member in this realm.

Working on this project has not only provided me with a knowledge base specific to the project but has also given me so many applicable skills. Big data is the new norm; efficient processing is being used across many domains. Understanding how this type of processing works, and having the opportunity to add to and adapt the system, are long-lasting skills. The room for creativity and input within this process has built a new approach to finding best practices and expanding data in different ways. I appreciate the underlying ways in which this project has strengthened skills that are applicable far beyond the prosecution field.

As a psychology major, I was unsure of how knowledge in this field would shape my future interests. In an unexpected manner, tPP created a strong interest in the effects of bias. Working within judgment and decision making research, I had understood how strong the influence of bias is. However, I did not understand its application until I saw an intersection between this and the results of prosecution in the United States. Not only are judges, juries, and other members of criminal proceedings driven by personal prejudice, whether conscious or unconscious, but these outcomes in turn influence the bias of society. There are endless cycles found in systemic discrepancy, and the only way to break them is to first be conscious of their existence and then act in a manner that opposes this automatic process. Cycles are dangerous in creating continuous disadvantage. Whether or not the project was meant to discover this, the numbers bring light to the situation. There is such strong potential for application arising from this project, and I am excited to see where others may take this and where I can utilize a similar type of information moving forward in my career.

Discussing acts of terrorism brings such a strong emotional component. The large acts that are quickly associated with the term are events that affect so many people. These are events that can connect us, but it is important to realize these are also events that isolate those who are not responsible. I thank this project for this realization. I thank this project for giving me an accessible outlet to gain knowledge on the current state of this type of event and, more so, the ability to analyze such events objectively. tPP has provided me with so many skills I will carry well past this project, and I hope students of all majors will take advantage of such a unique project. Dr. Loadenthal and all members of the project are so dedicated and are creating something truly incredible. Thank you tPP, I will miss working on the project but have no doubt of the future success to come!

tPP’s Research Showing Up in the News

We’re very pleased to see our past analysis mentioned in this recent article by the Philadelphia Inquirer, “Republican death threats are undermining our democracy.”

We’re always happy to work with journalists, academics, investigators, policy makers and practitioners who can make use of our data. Our goal remains the creation of a free, open source platform for knowledge construction, and until we’re able to make the entirety of the data set public, get in touch if we can be a resource.

The Realities of Research Work: A semester reflection


This continues our series of student reflections and analysis authored by our research team.


Stephanie Sorich

When I pictured my first time participating on a research team, I pictured something grandiose: Beautiful Mind-styled webs of ideas pasted around on the walls, philosophical discussion about the social consequences of our results, and a glamorous story to tell my friends and family at home. I ended up with something different.

Grunt work. Lots and lots of grunt work.

At least, that’s how it felt originally. Coming onto this project, the first few days were mostly just filled with confusion over the newness of everything. Combine this with the feeling of combing through source files for the very first time and the tedious feeling of trying to navigate the team’s drive. After week one, I was kicking myself for not taking ice skating instead.

There’s a difference between the research that you conduct in class and the research that is conducted by projects like this one. In class, the data collection process is quick and usually painless, and after a few weeks you’ve got a ten page paper to prove you did the work. I loved the feeling of accomplishment, and after three years in school, I began to dream about the actual impact I could have one day.

But the issue with projects of this size is that the work doesn’t last a few weeks. Long after I have graduated, this project will finally see its database completed. The odds of me being there to see the fruits of all of our labor are very small. What I am here for is the day in and day out of the project: coding and recoding, searching the internet for any source I can find more credible than a random local newspaper, and talking out the nitty gritty details. As a big picture thinker, this was a huge change of pace for me.

I know that the job of coding the cases is obviously essential to the project. But especially as someone who expected to waltz into the room and become the star of the research world, there’s a little something to be desired. However, as a senior, I know that this won’t be the last time I feel this way. As I move into my first job (which is hopefully in research as well) I know that I’ll be starting at the bottom of the totem pole. There’s a good chance that I’ll be in a supporting role for a long time before I see myself becoming the professional that I want to be one day.

You’ve got to start at the bottom to make it to the top. So here are a few pieces of advice from someone who just finished their first semester on a research project:

  1. Get used to having 20 tabs open at once. Find a case, check it with the spreadsheet, check it with another spreadsheet, file the case files. Have I enticed you yet?
  2. Know that the way you see things isn’t always going to be the right way to see things. In my case, it rarely was. Take advantage of the bright minds around you and learn how to look at things from different angles.
  3. Don’t take offense at someone pointing out your mistakes. Especially at the beginning, being the best you can be means taking criticism. When you give criticism, you don’t make it personal, so don’t take it personally when you’re on the other side.
  4. ASK QUESTIONS. Being the person who thinks they’ve got it all down when they don’t is far more embarrassing than having the humility to admit that you need help. All of your teammates will also be grateful that they don’t have to go back and fix your mistakes.
  5. Make yourself useful. There will be days where, in the course of your normal work, you find mistakes. Don’t be the one who turns a blind eye, be the one who goes in and fixes it.

Now I’m obviously not speaking as an expert. My one semester is basically a minute in comparison to the years that real professionals spend. But as the college equivalent of a grandfather in a rocking chair, know that starting at the bottom is okay and it will eventually help you as you move up. I think that this experience has made me a better teammate, a better learner, and an overall better worker. Research isn’t always the drama associated with changing the world; sometimes it really is just the drama of disagreeing with your coding partner.