This continues our series of student reflections and analysis authored by our research team.
Last week we had another busy meeting in order to get all of our ducks in a row before we delve into analysis. We addressed some issues with coding, discussed ways to publicize our efforts, and talked about things that we as individuals can focus on within the data.
One thing we spent a large chunk of time on was addressing some issues that have come to light concerning the coding variables: both the variables themselves and their options (i.e., coding values). For one, as the project has gone on, the variables that we are coding have changed; some have been added, while others have been adjusted to reflect the values that we’ve found. This has caused a discrepancy in how cases are being coded. Some of the older cases that were coded and verified at the beginning of the project are now missing variables, such as the “other” variable (see Athena’s blog post)! These cases appear in the database as complete when they are in fact currently missing information.
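For readers curious how a gap like this could be caught automatically, here is a minimal sketch in Python. It is not our team's actual tooling: the field names ("id", "status"), the list of required variables, and the sample records are all invented for illustration, and it assumes each case can be exported as a simple record of variable names and values.

```python
# Hypothetical codebook: variables every completed case is expected to carry.
# These names are assumptions for illustration, not the project's real schema.
REQUIRED_VARIABLES = ["charges", "other"]


def find_incomplete_cases(cases, required=REQUIRED_VARIABLES):
    """Return {case_id: [missing variables]} for cases marked complete."""
    flagged = {}
    for case in cases:
        # Only audit cases that claim to be finished.
        if case.get("status") != "complete":
            continue
        # A variable counts as missing if it is absent or left blank.
        missing = [v for v in required if case.get(v) in (None, "")]
        if missing:
            flagged[case["id"]] = missing
    return flagged


# Invented sample records, not real project data.
cases = [
    {"id": "case-001", "status": "complete", "charges": "example charge text"},
    {"id": "case-002", "status": "complete", "charges": "example charge text", "other": "no"},
    {"id": "case-003", "status": "in progress"},
]

print(find_incomplete_cases(cases))
```

In this sketch, a case coded before "other" existed is flagged even though it is marked complete, while cases still in progress are skipped. Rerunning a check like this whenever a variable is added to the codebook would surface older cases that need another pass.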
Similarly, there was a slight lack of communication amongst the team throughout the coding process, which caused other variables to remain uncoded or incorrectly coded. “Charges,” for example, proved too difficult to fit into a pre-set list of options, so it was decided that the exact charges for each case should instead be copied into the database.
These same challenges with team-wide communication also led into the discussion that we held on the variable “other.” The “other” variable was added later on in the project, so it was a two-part problem. Part one: a good portion of cases were simply never coded for the variable. Part two: there was a less-than-clear definition of what “other” meant when it was introduced.
We attempted to address this as a team and ran into the issue that calling someone an “other” is inherently subjective and differs from person to person. The original idea was to capture the mindset of your typical American jury, but we had trouble even coming to a consensus on what that looked like. As a team, we also struggled with the ethics of trying to take a subjective determination such as “other” and turn it into a variable. After all, who are we to decide what someone who doesn’t fit in looks like? We spent a good portion of the class debating what the function of the variable is, and to what extent we could make assumptions about other people’s mindsets when coding. If we as a team can’t find a middle ground on how to code for “other,” then why are we trying to? And with the number of assumptions that we are making in order to code for the variable, is it even worth coding for? In the end, we decided that it does have merit in helping us see generalized data trends, but it was a long discussion that led us there.
To wrap up, we tried “live coding” as a class for the first time this week! In an attempt to show a more real-time example of how the coding process is completed, we coded a few cases as a class, and we hope to upload a video of our teamwork soon!