I have a couple of ideas in mind. I couldn’t get the code to work, but I’ll have more time to work on it this coming week.
I tried running the code just to look at what’s returned in the SubjectHeadings.xml file. I might want to create a project that looks at when a subject heading was first created, and at the first object entered into the collection under that subject heading. Maybe I could bring in news headlines from that same time period. My question is: how does real life get into the collections?
This seems like a big project. I’m not sure how it would turn out! I’d have to interview a librarian about Subject Headings to get a better sense of how they work.
I was inspired by this LOC page, which lists all Subject Headings. The filter on the left suggests the LOC knows the general date for at least some of them, because I can narrow the list to only those created in the 2010s, for example.
While I did inspect the contents of the .xml Subject Heading files, I am not sure whether I am seeing records of Subject Headings or records of objects in the collection. I’m pretty sure it’s the former, but it’s not clear.
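One quick way to get a feel for what kind of records an XML file holds is to count the element names it contains. This is only a minimal sketch: the sample XML and its element names (`record`, `heading`, `created`) are made up for illustration, and the real SubjectHeadings.xml will almost certainly use different ones.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def local_name(tag):
    """Strip an XML namespace, so '{http://example.org}title' becomes 'title'."""
    return tag.rsplit("}", 1)[-1]


def count_tags(xml_text):
    """Count how often each element name appears in an XML document."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(el.tag) for el in root.iter())


# Hypothetical sample data; not the structure of the real LOC file.
sample = """<records>
  <record><heading>Jazz</heading><created>1986</created></record>
  <record><heading>Blogs</heading></record>
</records>"""

tag_counts = count_tags(sample)
for name, n in tag_counts.most_common():
    print(name, n)
```

If the most common element names read like parts of a heading (a label, a note, a date) rather than parts of an item (a title, a creator, a call number), that would support the guess that the file describes Subject Headings. To try it on a real file, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`.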
Either way, I want to know if LOC stores the year Subject Headings are created, and if not, how I might look for the earliest item in the collection for each subject heading.
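If the LOC does not store a creation year, the fallback of finding the earliest item per heading could look roughly like this. It is a sketch under big assumptions: it supposes I already have a list of (heading, year, title) rows from somewhere, and the rows below are invented placeholders rather than real collection data.

```python
def earliest_per_heading(items):
    """Return {heading: (year, title)} for the earliest dated item under each heading."""
    earliest = {}
    for heading, year, title in items:
        if year is None:
            continue  # undated items cannot be compared, so skip them
        if heading not in earliest or year < earliest[heading][0]:
            earliest[heading] = (year, title)
    return earliest


# Hypothetical rows; real data would come from the collection records.
items = [
    ("Jazz", 1925, "Item A"),
    ("Jazz", 1919, "Item B"),
    ("Blogs", 2004, "Item C"),
    ("Blogs", None, "Item D"),
]

first_items = earliest_per_heading(items)
for heading, (year, title) in sorted(first_items.items()):
    print(heading, year, title)
```

The earliest item's date is not necessarily when the heading was created, since older items can be catalogued under newer headings. That gap is one of the things I would want to ask a librarian about.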
I struggled to get the code to work; see my error below. It is probably something fairly simple that isn’t connected in my code.
I’m also curious how many people are in the LOC’s collections but not in Wikipedia. It might be interesting to compare the two and see where they differ. I could limit it to a certain period of time, occupation, or gender. At the end, I might share the list with groups who get together to add entries to Wikipedia.
I’d need to keep in mind that this idea is more about critiquing Wikipedia than the LOC.
I could also flip it and see how many people on Wikipedia are not in the LOC.
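Both directions of this comparison come down to set differences between two lists of names. Here is a minimal sketch, assuming I already had the two lists in hand; the names are placeholders, and the `normalize` helper is my own simplification. Real matching would be harder, because the two sources may write the same name differently (for example, "Last, First" order or added dates).

```python
def normalize(name):
    """Lowercase a name and collapse extra whitespace so trivial differences match."""
    return " ".join(name.lower().split())


def compare_names(loc_names, wiki_names):
    """Split two name lists into LOC-only, Wikipedia-only, and shared names."""
    loc = {normalize(n) for n in loc_names}
    wiki = {normalize(n) for n in wiki_names}
    return {
        "loc_only": sorted(loc - wiki),
        "wiki_only": sorted(wiki - loc),
        "both": sorted(loc & wiki),
    }


# Hypothetical placeholder names, not real query results.
loc_names = ["Person One", "Person  Two", "Person Three"]
wiki_names = ["person two", "Person Four"]

result = compare_names(loc_names, wiki_names)
for group, names in result.items():
    print(group, names)
```

The "loc_only" group is the list I might share with Wikipedia editing groups, and "wiki_only" is the flipped version of the question.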
Examining the LOC with a Critical Eye
Along the lines of the last idea, I find it hard to critique the LOC’s holdings without comparing them to something outside the LOC, which means bringing together two data sets. I’m interested in trying it but want to talk through what this would look like.