Algorithmic Authority, Emergent Biases, and Implications for Information Literacy

Algorithms have become ubiquitous in our modern, technologically driven society. Algorithmic tools embedded in search systems to “enhance” the information-seeking experience carry problematic epistemological concerns. These algorithms are developed and inserted into search tools by human beings who, consciously or not, tend to impart biases into the information retrieval process. These search tools have become our primary arbiters of knowledge and have been granted relatively unmitigated sovereignty over our perceptions of reality and truth. This article raises awareness of how the bias embedded within these algorithmic systems structures users’ perception and knowledge of the world, preserving traditional power hierarchies and the marginalization of specific groups of people, and examines the implications of algorithmic search systems for information literacy instruction from a critical pedagogical perspective.


palrap.org
is increasingly expressed algorithmically (Pasquale, 2015). Implementations and technical specifications of algorithms tend to be nebulous and are in a near constant state of change, which is often related to the continual striving for improved "performance" (Reidsma, 2019). This constant state of updating and altering a search algorithm's implementation raises another set of issues related to which aspects are interpreted as "performance" pieces, and which aspects of those pieces are prioritized. This ever-changing state of implementation can make it difficult for an information seeker to maintain a concrete understanding of how a given algorithmic system is functioning and why certain search results appear on their screen.
These issues must be discovered, explored socioculturally, and considered ontologically when examining the implications for information-seeking and for enhancing one's capability as an information-literate researcher, scholar, and teacher. Information literacy librarians must engage in intentional, critical pedagogical teaching practices to ensure that students, as budding researchers and information consumers, are equipped with a critical perspective to successfully interrogate the complex algorithmic systems that have become essential to our information ecosystems both on and off campus.

Literature Review
The literature reveals multiple issues related to transferring authority entirely to algorithmic systems in the context of information retrieval. According to Reidsma (2019), the secrecy surrounding the details of how an information retrieval system, such as a search engine, is implemented is a persistent issue, as it is typically the intellectual property of the parent company. Therefore, properly interrogating the details of how a search engine interprets the input data of a user, utilizes that input data for retrieval, as well as how it locates, retrieves, and displays output data is difficult. These implementation details are often either purposefully obfuscated or impenetrably opaque, which obscures a true understanding of functionality and results in an epistemological crisis in which authority is given without properly understanding if it should be (Reidsma, 2019). Seaver (2019) discussed the issue of how these systems digest and distill complex data that represent complex phenomena in the real world into simplified, mathematical representations of those phenomena. In order to function, algorithms must translate complex social phenomena into mathematical inputs and outputs, in many cases, deriving a "fuzzy" approximation of reality, which may lead to a disconnect between user intent and algorithmic translation and result in problematic or erroneous search results (Reidsma, 2019). Dormehl (2014) detailed a third area of concern: the common misconception that algorithms are dispassionate, neutral arbiters of information retrieval despite being implemented by human beings that can and do embed biases and perspectives into the code they write. According to Martin (2008), software systems are generally designed, developed, and tested in individual modules: Each team builds and tests a specific functional part of the whole, typically in isolation from the rest of the system. 
These modules are then merged into a whole system prior to full release and implementation, resulting in a siloing effect in which developers may be interjecting their perspectives not only into the code of one module but also into the performance of the whole system. Thus, the outputs of a fully realized system across multiple search contexts may be extremely difficult not only to predict but also to test comprehensively (Kearns & Roth, 2019). This unpredictability is especially pronounced in systems that utilize machine learning

problematic examples with commercial searches, such as googling "Black girls" and retrieving several pages of pornographic representations of Black women, and searching "Jew" and receiving an entire first page of top results, ranked by Google's PageRank algorithm, consisting of anti-Semitic websites. Reidsma (2019) evaluated several library discovery systems and found peculiar results related to algorithmic biases. One in particular occurred when he ran a simple search for "racism" in EBSCO Discovery Service: the initial results in its topic explorer displayed only information related to the "scientific racism" movement, which is clearly misleading. He also discussed ProQuest Summon and its topic explorer algorithm, which surfaces results at the top based on the keywords entered. One of the many disturbing cases was a search for "African history" that returned results exclusively related to African American history, essentially disregarding the history of African people by centering the results on their history within the context of the United States and completely ignoring the colonial and post-colonial periods of African people residing outside of the United States.
Algorithmic systems are also guilty of problematic autosuggestions for users. Reidsma (2019) pointed to Summon completing the sentence "Muslims are..." with "terrorists." There are examples of these conflations and overtly racist, sexist, indecent, or otherwise questionable suggestions in commercial search results as well. Noble (2018) pointed to a search of "Why are Black women so..." that Google autocompleted with "angry," "loud," "mean," "annoying," and "lazy." These negative representations were depicted as the cover art for her book, but Google has corrected these distressing autosuggestions since publication. Bucher (2018) added other concrete examples of algorithmic bias, such as Google's "gorilla incident," in which an image search for "gorilla" returned a significant number of images of African Americans, as well as Amazon's well-documented practice of excluding predominantly African American zip codes from same-day delivery service. Sweeney (2013) pointed to Google's penchant for displaying ads for proprietary arrest-record search services when predominantly Black names are entered as keywords. Caliskan, Bryson, and Narayanan (2017) found that widely used language-processing algorithms, trained on human writing from the Internet, reproduce human biases along racist and sexist lines.
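The mechanism behind the Caliskan, Bryson, and Narayanan finding can be illustrated with a toy sketch. Word-embedding models place words in a vector space, and cosine similarity between vectors measures association strength. The vectors below are invented for illustration only, not taken from any real model; they mimic how occupation words in embeddings trained on web text end up geometrically closer to one gendered pronoun than the other.

```python
import math

def cosine(u, v):
    """Cosine similarity: 1.0 means identical direction, near 0.0 unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical 2-dimensional "embeddings" -- real models use hundreds of
# dimensions learned from web text, but the geometry works the same way.
vectors = {
    "engineer": [0.9, 0.1],
    "nurse":    [0.2, 0.8],
    "he":       [0.8, 0.2],
    "she":      [0.3, 0.9],
}

# Biased associations absorbed from training data show up as geometry:
print(cosine(vectors["engineer"], vectors["he"]))   # higher
print(cosine(vectors["engineer"], vectors["she"]))  # lower
```

Caliskan et al.'s Word-Embedding Association Test formalizes this idea over many word pairs; the point of the sketch is only that "bias" in such a system is not an opinion held by the software but a measurable geometric artifact of the data it was trained on.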
The literature examining the intersection of racial, sexual, gender, and class biases embedded into the fabric of our increasingly technocratic society is clear, offering many examples of problematic biases within a sociocultural context that have been replicated within algorithmic decision-making. Therefore, individuals who design, build, develop, and code algorithms within these systems are, in fact, knowingly or unknowingly incorporating these biases into software systems that are authorized to make choices that have effects in the real world in a myriad of ways (Benjamin, 2019).
According to the Pew Research Center's analysis of the U.S. Census Bureau's American Community Survey conducted between 2014 and 2016, there are 4.4 million computer workers aged 25 and over employed full-time in the United States; women represent only 25% of this workforce, and Blacks a mere 7% (Funk & Parker, 2018).
The survey also showed that the computer workforce is composed overwhelmingly of white and Asian men. The result of this inequity is a limited number of perspectives involved in the development and implementation of algorithmic search and retrieval systems, which has resulted in the adherence to and perpetuation of unequal power structures with regard to economic status, race, gender, and sexuality in society (Noble, 2018).

Search results can obscure any struggle over understanding, mask history, reframe our thinking, and deny us the ability to engage deeply with essential information and knowledge that has traditionally been learned through teachers, history, books, and experience (Noble, 2018). Software systems are built on several layers of mathematical abstraction, making it difficult to generate an exact one-to-one relationship to the real-world entities they attempt to describe. By returning "objective" search results that are implicitly trusted but based on imprecise models, search results can structure and determine a perceived identity for groups of people who may already be marginalized in a society (Reidsma, 2019). This situation is troublesome not only due to proliferating misleading perceptions of reality, based on biased search engine results, but also due to algorithms either implicitly or explicitly hiding valuable information (Benjamin, 2019).
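The "imprecise models" problem can be made concrete with a deliberately naive relevance scorer. The sketch below is a crude term-frequency proxy, not any vendor's actual ranking algorithm, but every real scorer ultimately rests on numerical proxies of the same kind: a document that merely repeats the query's words can outscore a substantive document that discusses the topic in different vocabulary. The query and documents are invented for illustration, echoing the "African history" example from the literature review.

```python
def tf_score(query, doc):
    """Naive relevance proxy: fraction of document words matching query terms."""
    terms = query.lower().split()
    words = doc.lower().split()
    return sum(words.count(t) for t in terms) / len(words)

query = "african history"

# A keyword-stuffed page with little substance...
stuffed = "african history african history african history buy access now"

# ...versus a genuinely relevant text that never uses the exact query words.
substantive = ("a survey of colonial and post-colonial societies "
               "across the continent and its diaspora")

print(tf_score(query, stuffed))      # high score despite low value
print(tf_score(query, substantive))  # zero: relevant, but lexically different
```

The proxy measures word overlap, not meaning, so whatever the overlap statistic happens to privilege is what the user sees as "relevant."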

Considerations Regarding the Proprietary Nature of Information Systems
As information progressively moves from the public sphere to private control by corporations, we increasingly arrive at a critical juncture in which the quality of information and the ability of the public to sift through it is at stake (Noble, 2018). Information, as with many other industries that have been monetized by for-profit corporations, carries a significant amount of weight in shaping public conversation and potentially a common public understanding of reality itself (Bucher, 2018). Social media has added another structural layer: monetizing information based on a quantification of the traffic directed towards it, a practice rife with ethical concerns, as algorithmically favored information is presented based solely on its popularity rather than its relevance or accuracy (Bucher, 2018).
This practice can also be found in the gaming of page-rank algorithms, such as Google's, in order to drive information or a specific website to the top of the search results (Bucher, 2018). This process could spread to scholarly databases, causing a myriad of issues related to citation metrics and "best match" search result filters. There are also potential ethical concerns regarding publishers who produce database search and retrieval systems that favor their own publications over other vendors'. This shift towards corporatization further erodes societal protections vital to a democratic society and ensures a privileging of information by commodifying access to information that was once freely available (Noble, 2018).
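Why link-based ranking is gameable can be shown with a minimal sketch of PageRank-style scoring. This is a simplified power iteration, not Google's production system: because a page's score is fed by the scores of the pages linking to it, fabricating inbound links from throwaway pages (a "link farm") inflates a target page's rank. The tiny link graphs below are invented for illustration.

```python
def pagerank(links, d=0.85, iters=50):
    """Simplified PageRank. links maps each page to the pages it links to;
    every page is assumed to have at least one outbound link."""
    pages = list(links)
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        scores = {
            p: (1 - d) / n
               + d * sum(scores[q] / len(links[q]) for q in pages if p in links[q])
            for p in pages
        }
    return scores

# An honest little web: no page links to "target".
honest = {"a": ["b"], "b": ["a"], "target": ["a"]}

# The same web plus a link farm of throwaway pages pointing at "target".
gamed = dict(honest, f1=["target"], f2=["target"], f3=["target"])

print(pagerank(honest)["target"])  # low: no inbound links
print(pagerank(gamed)["target"])   # inflated by the farm
```

Real search engines layer spam detection and other defenses on top of this core idea, but the underlying incentive remains: whoever controls link structure can influence what the algorithm declares authoritative.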
In the library acquisitions context, vendors piece together large subscription packages to be purchased by individual libraries or consortia. This clear commodification of scholarship specifically privileges wealthy institutions over those with smaller budgets, eroding information access and furthering the stratification of higher education institutions into "haves" and "have nots." Therefore, students at a smaller institution may be able to discover information through a library database search but not be granted full-text access, depending on which curated database subscription package their institution's library can afford to purchase. The control of information access, a very powerful social responsibility, is consolidating: the indexing, algorithmic searching, and sorting of information is being handed to an ever smaller number of corporations and software developers (Noble, 2018).

2018). It prompts students, researchers, and professionals to engage in critical conversations around issues of authority, a central problem area identified by the scholarly literature on algorithmic search systems. This critical perspective is essential to information literacy within the current and ever-evolving information ecosystem. A growing body of scholarly evidence indicates that algorithmic systems must be understood and critically interrogated in order for one to truly develop into an information-literate researcher. A clear and thorough understanding of how these algorithmic systems function can only arise through examination via a critical perspective and consciousness.
The framework contains two essential frames to be considered when critically interrogating algorithmic systems and their potentially problematic nature during information literacy instruction. The first is Authority is Constructed and Contextual, which is explicit about the socially constructed nature of authority and argues that information literacy includes the ability to "acknowledge biases that privilege some sources of authority over others, especially in terms of others' worldviews, gender, sexual orientation, and cultural orientations" (ACRL, 2016). This frame effectively positions authority as rooted in context and subject to challenge (Drabinski, 2017).
Following the concept of critical dialogue developed by the influential pedagogical theorist Paulo Freire (1970), there are several practical classroom implementations that engage students in developing their perceptions from a critical conversation into a critical consciousness of this frame. One approach is to interrogate algorithmic authority using active learning techniques, such as breaking students into small groups for critical examination and discussion (Swanson, 2004).
Prompt student groups to search for the same keyword across multiple search engines that function in different contexts, such as Google, an academic database, and a library discovery system. Each student can choose one of the search engines and answer problem-posing discussion prompts related to the nature of authority. The librarian instructor can pose questions, such as:

• What entities influenced or had a hand in the way this search engine functions?
• What information is shown to you?
• How could the outside entities you've identified influence the way these search systems are built?
• How did your search results differ from the other students in your group who used other search engines?
• Why do you think these results are different when you've used the same keywords?
• In this search context, who is expressing authority over which information has value and should be shown to you, the user?
The objective of the questions is to lead students to critically discuss authority and have an open dialogue about how the algorithmic systems they interacted with made choices for them and expressed authority related to the

The third frame, Information Has Value, is also pertinent when introducing a critical understanding of algorithmic search and retrieval systems and how they function within our current information ecosystem. It invites students to examine the political and financial implications relevant to information commodification and its production, circulation, and access (Drabinski, 2017). This frame also mentions that intellectual property is a legal and social construct that varies by culture and that students should consider themselves agents of social change and contributors to the information marketplace rather than passive information consumers (ACRL, 2016).
Instruction should help students develop a critical understanding of algorithmic search and retrieval systems, understand the implications of the systems' proprietary nature, and recognize that authority has been passively and almost unknowingly ceded to algorithmic systems. Librarian instructors can again engage in problem-posing to incite critical conversations among students. For example, students could investigate journals and publishers to ascertain how they receive funding and how those sources may be tied to larger publishing organizations. Students may also try to unearth how those organizations relate to creators of proprietary software products, such as library discovery systems, database curation, and platform creation, which may include search functionality. This exercise will lead students to critically examine how the commodification of these systems may be problematic in the privileging of information and the preservation of marginalization and cultural hierarchies (Warren & Duckett, 2010). Students can work in pairs in which each student is assigned a scholarly journal published by a different kind of entity (e.g., one student has an Elsevier or a Sage journal, and the second has either an open access journal or a journal published by a university press). These journals should be drawn from the same subject area in order to surface disparities within a common discipline. The students work together to critically examine each journal's funding sources and involvement in other industries related or unrelated to scholarly publishing, including their publishing platforms, the databases they host, or their involvement in funding research. Next, students discuss problems, such as, "How might the intertwined financial relationships of these journals and their publishers affect the articles published, their peer-review process, and the discoverability and accessibility of the articles?"
Then, they can discuss concerns related to the monetization of journal access, publishers and platform providers giving financial support to researchers, and ties between publishers and industries related to the research they publish. This dialogue can create a disposition towards systematic critical information consumption. Again, questions and prompts for further discussion should focus on critical discussion, not merely deposit tokens of information, and problems posed should allow for open-ended discussion and critique among students before the librarian instructor engages in a larger group discussion (Tewell, 2018).

Conclusion
Biased results that appear in seemingly objective search tools are, in part, the consequence of treating everything as a mathematics problem, assigning numerical values to unquantifiable things, and accepting measurable proxies for complex concepts and ideas. Numerous scholars have pointed out the problematic aspects of these mathematical proxies and how they tend to amplify existing sociocultural structures that marginalize specific groups based on generalized socially constructed identities. The issue is not an inability to correct problematic algorithmic outputs and their consequences; developers must be held accountable for software system implementation, and software design must begin to account for and correct these issues. Algorithms manifest as objects of social concern, particularly as we grant authority for information access and selection to automated algorithmically driven systems.
Teaching librarians must reorient instructional curricula to include this increasingly vital knowledge related to algorithmic literacy. With the publication of the ACRL Framework for Information Literacy as a guide, the profession has moved towards framing information literacy instruction within the realm of critical pedagogy. Algorithmic systems are rife with opportunities to further the development of what Freire calls "critical consciousness."

When instructed on how to evaluate information for its scholarly merits, students must also learn to critically scrutinize how the algorithm may have chosen to display certain information while potentially hiding some that may have been valuable. To become information-literate students and citizens of the world, they must leave academia equipped with a sharpened critical consciousness, so that they are not only capable of but disposed towards critically evaluating information from a search system, how it functioned when queried, and the implications of its functionality. This will not only embolden confident, efficient, and thorough researchers but also inculcate in students a skill set that can and should be transferred to information-seeking in all aspects of life. The practice of critical consciousness within information literacy instruction, along with advocacy for ethical software design, may gradually dissolve the problematic feedback loops created between our culture and the algorithms built within it.