
The Dirty 30:
Tech’s Top Privacy Polluters
When their eyes open wide and there’s more eyes inside . . .
AI & Privacy
Welcome to the Dirty 30, a “worst of” list inspired by the EWG’s Dirty Dozen most pesticide-laden foods. The Dirty 30 is our curated list of privacy polluters: the companies, software, platforms, and apps doing the most to exploit your data and undermine your privacy.
In the age of AI, data and privacy concerns take on heightened importance. The large language models (LLMs) behind the AI boom, which fuel chatbots, AI agents, and business insight tools, are desperately thirsty for training data to help them achieve more human-like capabilities.
Data is the new oil.
You need only look as far as your inbox. Have you received any notices of late from your SaaS providers regarding changes to terms of service? Companies are routinely expanding the types of data they collect and broadening the purposes for which they can use that data, including to train their AI models.
Add to this the concern that your confidential data may no longer be confidential. Once data is input and absorbed into the LLM, that data becomes fodder for answers delivered to other users who are not you.
AI models are also capable of more sophisticated profiling, which raises concerns around bias and discrimination, as well as increased risks of fraud or identity theft as data aggregators create digital clones that can pass verification tests. In the best case, maybe your Netflix recommendations are more relevant. In the worst case, you’re accused of fraud, arrested, and forced to file for bankruptcy.
Note that despite FTC warnings that surreptitiously changing one’s terms of service could constitute unfair or deceptive trade practices, not all companies are proactively notifying users of their policy changes. In fact, many companies state in their terms of use that they can change their policies any time and that your continued use of their service or website constitutes your consent. This may or may not be legal, but it is definitely not done with your best interests in mind.
To help you on your awareness journey, we’ve compiled this list of the worst data offenders. While some of these players are truly and objectively bad, any list will necessarily involve some subjectivity. We’ve listed the factors we considered most important in our assessment, along with our general methodology and reasoning, so you can make your own informed decisions.
Factors We Considered
🔹 Trustworthiness. What is the company’s track record around data collection, handling, and security? Is it forthcoming and above-board or sneaky and opaque? Does it have a trail of fines, lawsuits or breaches stemming from its data or security practices?
🔹 Data Minimization. Does the company overreach, collecting more data than is reasonably necessary to provide the service? Does it collect information about non-users or collect data from third parties to build a dossier on users?
🔹 Data Sensitivity. Does the company collect sensitive data, such as name, address, phone number, social security numbers, health information, financial or credit information, biometric data, location information, or behavioral or sentiment data?
🔹 Data Usage and Sharing. Does the company use your data for profiling or inferencing, share your information with third parties or an unknown number of “marketing affiliates,” or allow its employees access to review your data? Many companies like to brag that they never “sell” your data, but that doesn’t mean they aren’t sharing it.
🔹 Data Protection and Security. Does the company employ data security best practices, such as end-to-end encryption, multi-factor authentication, role-based access, and strong internal controls?
🔹 Privacy Impact. How many users are impacted by the company’s data and security practices? Our analysis is biased towards services that are widely used, especially in the United States.
🔹 Data Control. How much control do users have over their data via opt-out features and setting permissions and controls? How easy is it to opt out and toggle privacy features?
🔹 Consent and Transparency. How transparent is the company about what data it collects and how it uses that data?
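The multi-factor authentication mentioned under Data Protection and Security usually means the familiar six-digit rotating codes, which are time-based one-time passwords (TOTP, RFC 6238). As a rough sketch of how those codes work, here is a minimal standard-library implementation; the base32 secret in the usage note is a hypothetical example, not a real credential:

```python
import base64
import hashlib
import hmac
import struct
import time

def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226)."""
    msg = struct.pack(">Q", counter)                      # 8-byte big-endian counter
    digest = hmac.new(key, msg, hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                            # dynamic truncation
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret_b32: str, period: int = 30) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the current 30s window."""
    key = base64.b32decode(secret_b32, casefold=True)
    return hotp(key, int(time.time()) // period)

# Hypothetical shared secret, e.g. totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ")
```

Because the code rotates every 30 seconds, a code stolen in a breach is worthless minutes later, which is why MFA blunts the impact of the credential leaks described throughout this list.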
Methodology and Logic
Our methodology prioritizes the first six factors, in that order. Here’s why:
🔹Trustworthiness ranks first because when a company has a history of shady practices and a trail of regulatory violations, fines, lawsuits, and whistleblowers, suffice it to say you are introducing a particular type of counterparty risk when you use its services. A company’s track record tells us what it does versus what it says. It also speaks to how privacy-forward the company is, how seriously it takes its role as a steward of your data, how meticulous and competent it is in performing that role, and whether it sees you as the customer or the product.
🔹Next, we look at the scope of a company’s data collection: if they don’t collect it in the first place, there’s not a problem!
🔹Even if a company does collect it, if it’s not particularly sensitive data or prone to misuse, not a problem.
🔹Even if it is a problem (in terms of the potential for unauthorized access or misuse), if the company is not actually using the data for an unreasonable purpose, perhaps we have only a small, contained, or even a theoretical problem.
🔹And if the problem remains small, contained, or theoretical, we can probably deal. But if the company amplifies the problem by indiscriminately sharing data (whether for $ or not) or having security lapses, now we get twitchy.
🔹And, finally, if the company has a massive customer base, this speaks to impact.
Our model assigns lower priority to the final two factors:
🔹We give points for data control features, but we’d rather not have to control our data at all. We’d prefer a privacy-first approach that requires proactively opting in to any shenanigans.
🔹We also give points for transparency because it plays a role in informed consent, but a company telling us it plans to surveil us and manhandle our data before it actually does so is cold comfort.
With that behind us, let’s dive into the dumpster:
Feed me your data.
Rankings
Trustworthiness is our highest ranking factor, so let’s start with the least trustworthy of the bunch. These companies engage in practices such as installing spyware/malware on your devices, conducting warrantless searches, lying to your face, and employing questionable practices in general.
How Did They Earn Their Spot?
#1 Temu
First out of the gate, we have Temu, a popular Chinese e-commerce site that has been downloaded by 185.6 million users in the U.S., which equates to roughly 78% of the adult population. Temu has been the subject of multiple lawsuits, with allegations including: installing spyware and malware on users’ devices, failing to comply with security standards, which can compromise users' financial information, and misleading users about how it collects their data.
The Temu app is alleged to be able to access other data and apps on a user’s device, based on the extensive access and permissions it requires. This means an employee who has downloaded Temu on a work phone has potentially exposed their company’s data. And if Temu is downloaded on an employee’s personal phone and the employee accesses company data from that phone, such as checking email, the company is also exposed. This should be very concerning for companies and individuals alike.
Maybe that air fryer isn’t such a bargain.
#2 Clearview AI
Haven’t heard of these guys? Allow us to introduce you. Clearview AI describes itself thus:
“We help law enforcement and governments in disrupting and solving crime, while also providing financial institutions, transportation, and other commercial enterprises to verify identities, prevent financial fraud, and combat identity theft.”
Sounds good, right? And in the right hands, it probably is. But we are talking about the government and profit-driven enterprises. You can read about the law enforcement concerns here, but we’d also like to highlight that if you have innocently shopped at Macy’s, Kohl’s, Target, Walmart, BestBuy, Albertsons, or Kroger, your face and other biometric data, along with other data Clearview has collected about you from myriad third parties, are likely in their database.
Clearview has been fined multiple times and has defended against a number of lawsuits for alleged privacy violations in the U.S. and abroad arising from the use of its software. Macy’s also faced a class action lawsuit for spying on its shoppers with the Clearview AI software, which recently reached a proposed settlement. Clearview AI has now been permanently banned nationwide from making its database available to most private actors, including most businesses.
Is this really the AI future we’re envisioning?
#3 - 7 FAAMG
FAAMG refers to Facebook, Amazon, Apple, Microsoft and Google, which amalgamate vast quantities of data into a giant ecosystem. News stories about the big 5 tech companies and their questionable practices abound.
We have the Facebook Cambridge Analytica fiasco (among other scandals), Google’s attempts at not being evil, Amazon’s many transgressions, Microsoft’s myriad data breaches, including one particularly egregious self-inflicted instance resulting from a misconfiguration that caused 2.4 terabytes of user data to leak, and Apple’s many troubles, despite allegedly being more privacy-focused.
Here is a sampling of their data collection practices:
Facebook (now Meta): Extensive and aggressive social and personal data collection, including collecting data on non-users (used to create shadow profiles) and collecting data from third-party sources. It also sells user data. An ad-revenue-driven model creates perverse incentives for Meta to accumulate and sell your data for targeting purposes, as it builds detailed psychographic profiles for advertisers based on social interactions and behaviors across its platforms.
Amazon: Extensive e-commerce and consumer tracking, focused on consumer intent, shopping behavior, and household data for personalized recommendations and targeted ads.
Search and purchase history (Amazon.com, Alexa); Browsing behavior on competitor sites (Amazon ads tracking); Alexa voice interactions & smart home activity; Streaming behavior (Prime Video, Audible, Kindle); Delivery and shopping preferences; Return patterns & product reviews; Shopping habits (e.g., bargain shopper, premium buyer); Product preferences & likelihood to buy; Voice command patterns & interests (via Alexa); Reading and media consumption habits; video and audio recordings (Ring cameras).
Apple: Apple tries to differentiate itself with privacy-focused data collection and minimal third-party tracking, but it still collects a lot of data.
Minimal ad tracking (Apple’s ad network, App Store interactions); Device analytics (iPhone, Mac, Apple Watch); Siri voice interactions (processed on-device where possible); Health and fitness data (Apple Health, Apple Watch); Apple Pay transactions (processed with privacy protections); App usage patterns (for App Store recommendations); and Health and fitness habits (Apple Watch, Health app).
Microsoft: Microsoft prioritizes enterprise data collection and productivity-focused tracking over consumer profiling, with less emphasis on direct advertising. Not none—just less. So still a lot.
Search history & queries (Bing, LinkedIn, Windows Search); Office document content (limited telemetry for AI features); Device & Windows usage data (Windows telemetry, Xbox); Communications data, including sentiment analysis (Teams, Outlook metadata); LinkedIn career insights & professional networks; Business professional profiles (based on LinkedIn activity); Tech and software usage trends (from Windows data); and Workplace collaboration habits.
Google (now Alphabet): The most comprehensive data collection of the bunch. Google’s ad ecosystem relies on detailed behavioral profiling across websites, apps, and devices. Google has also misrepresented its data collection practices to users.
Search history & queries (Google Search, YouTube); Browsing activity (Chrome, Google Analytics); Location history (Google Maps, Android, IP tracking); Device and hardware data (Android, Nest, Fitbit); Voice recordings (Google Assistant, Nest, Pixel Voice); Emails and documents (Gmail, Google Drive, Docs); Purchases & transactions (Google Pay, Gmail receipts); Ad interactions (Google Ads, YouTube ads, third-party sites); App usage (Google Play, Firebase analytics); Third-party website behavior (Google Analytics, AdSense); Interest categories (e.g., "Tech Enthusiast," "Luxury Shopper"); Demographics & income estimations; Political & religious affiliations (inferred from behavior); Travel habits (based on Google Maps and searches); and Health & fitness tracking (Fitbit, Google Health).
#8 - 10 FAAMG-Enablers
Big tech would not be possible without its enablers—internet and mobile service providers that make data collection and transmission possible. The largest of these in the U.S. are Verizon, AT&T and T-Mobile. They are all pretty terrible when it comes to privacy and safeguarding your information, and both Verizon and AT&T have actively cooperated with the NSA’s warrantless domestic surveillance of U.S. citizens, though AT&T seems to have been the most obsequious in its efforts.
All of these providers have been the subject of multiple large data breaches over the years, which you can read about here, here, and here. On the plus side, they do offer opt out provisions and customizable privacy settings, but they are still overbroad in their data collection and sharing and permit third party tracking, which run counter to several of our top privacy factors.
#11 - 13 LLMs
To continue improving their models, LLM companies need vast amounts of data, well into terabytes and perhaps eventually yottabytes.
LLMs such as OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude have already ingested massive amounts of data scraped from the internet or obtained from third parties, without user consent and allegedly in violation of intellectual property rights, which certainly raises questions around trustworthiness. Add to that the vast amounts of additional data they require to continue training and fine-tuning their models, which is the polar opposite of “data minimization,” our second most important factor.
OpenAI’s data collection policy and practices are the murkiest, raising issues of transparency and consent. Anthropic’s are the most privacy-oriented: users must opt in before their data is used to further train the model, though there is no way to opt out of any past data that was used to train the model in the first instance.
Google sits somewhere in between and offers many third-party integrations, which is both very convenient and very concerning from a privacy standpoint: each third-party provider has its own data privacy and use policies, creating a complex maze for users in terms of how their data is handled. Since we already included Google in our rankings, we don’t count it here.
DeepSeek, a Chinese startup, released its free, open-source LLM in January. It is said to rival ChatGPT but was developed far more cheaply. DeepSeek quickly became one of the most downloaded apps. It raises the same privacy concerns as other LLMs, with the added layer of potential state surveillance and censorship.
#14 - 17 Social Media Apps
You knew this was coming: social media apps are not your friend. Included in this group are TikTok, Snapchat, Reddit, and X (fka Twitter). Meta’s Facebook and Instagram also fall into this category, but we’ve already counted Meta in our rankings.
These apps raise a variety of privacy concerns: extensive data collection, targeted advertising based on personal information, location tracking, data breaches, the potential for identity theft through data mining, account hacking, the sharing of personal details on profiles, cyberbullying, and the ability of employers to access user data from posts.
#18 - 19 Nissan and Kia
Automakers have been in the hot seat for invasively collecting telemetry data on drivers, including vehicle speed, acceleration, braking force, steering wheel angle, gear selection, location, fuel consumption, and sometimes even driver inputs like pedal pressure. GM was sanctioned by the FTC for sharing that data with insurance companies without user consent, leading to abrupt mid-year premium increases for some drivers.
In 2023, The Mozilla Foundation did an excellent review of the privacy and security practices of automakers and concluded that “Cars are the worst product category we have ever reviewed for privacy.” They found that all automakers engage in gross overreach in data collection and sharing, offer little to no user control over data, and present data security concerns.
Two particularly creepy automakers make our list, as they claimed the right to collect and share information about your “sexual activity” (Nissan) or your “sex life” (Kia). Though that language seems to have disappeared from their respective privacy policies as of this writing, it remains unclear whether they have retained any such data that they previously collected.
To whet your appetite, here is a nauseating sampling of some of the data Kia claims the right to collect about you, beyond just the usual account information, user preferences, and geolocation data:
Sensory data, including “audio, electronic, visual, thermal, olfactory, or similar information”
Employment data, including “current or past job history or performance evaluations”
Education data, including “education records directly related to a student maintained by an educational institution or party acting on its behalf (e.g., grades, transcripts, schedules, and student ID numbers)”
Biometric data, including “imagery of the iris, retina, fingerprint, face, hand, palm, vein patterns, and voice recordings, from which an identifier template, such as a face print, a minutiae template, or a voiceprint, can be extracted.”
Internet activity, including “browsing history, search history, and information regarding interactions with an Internet Web site, application, or advertisement”
Consumer activity, including “records of personal property, products or services purchased, obtained, or considered, or other purchasing or consuming histories or tendencies”
They also reserve the right to use the information they collect to draw inferences that may reflect “your preferences, characteristics, predispositions, behavior, attitudes, or similar behavioral information.”
If you have pearls, now would be a good time to clutch them. You might also think twice about syncing your devices with your vehicle’s onboard system, as well as taking sensitive business or personal calls in the car.
#20 Equifax
Equifax is a major credit reporting agency involved in one of the largest data breaches to date, losing the personal and financial information of 150 million consumers. The breach was due in part to unpatched software that was left to fester for months, and in part to access credentials stored in a plain-text file on a server, which hackers then accessed. Following the breach, Equifax delayed informing consumers for weeks, leaving them unable to take defensive measures against fraud or unauthorized access to accounts.
Needless to say, if you’re going to collect highly sensitive data on virtually the entire U.S. adult population, maybe be a bit more meticulous about your data handling and security practices.
Given ongoing concerns with security, we highly recommend freezing your credit with all three credit reporting agencies, which is free and can be temporary or permanent.
#21 - 30 Popular Business Tools
We were tempted to include a number of other data and privacy offenders here, but we wanted to dive into some widely used business tools that may be of particular interest to our audience.
#21 PayPal
PayPal is an online payment system that dominates the digital payment industry, with 278 million active consumer and merchant accounts in the U.S. alone, nearly 20 billion transactions in the first 3 quarters of 2024, and 45% market share. Its data collection policies are roughly as broad as Kia’s, it has been accused of clandestine and deceitful practices, and it has been the subject of numerous cyberattacks.
#22 Zoom
Zoom has the dubious distinction of being among the earliest AI scandals, mere months after ChatGPT made its debut beyond the developer community. Zoom decided to be super shady with its AI rollout, quietly updating its terms of use in March 2023 without notifying users. In August 2023, a user stumbled across the changes, which broadly expanded the types of data Zoom could collect and the purposes for which it could be used.
Consumers were furious, and the FTC was none too pleased. Zoom later issued a statement clarifying that in practice it only uses such data with user consent, but the damage to trust was done. Zoom now offers some privacy settings you can toggle, but unless you’re the meeting organizer, you have no control over them and may not be in a position to object.
#23 Adobe
Not to be outdone, Adobe followed suit with its own AI scandal, also related to quietly updating its terms of service to allow access to user content through “automated and manual methods” and through use of “techniques such as machine learning” to analyze content to improve services, software, and user experiences. Designers revolted. Adobe made a blog post explaining it was all a misunderstanding.
If by “misunderstanding” Adobe means “intentional lack of transparency,” then we agree.
#24 Dropbox
Same song, next verse. Dropbox drew heat when it was discovered that it was allegedly sharing user files with OpenAI without consent. This too was chalked up to a misunderstanding, but it added to the growing distrust of AI companies and their models, and of Dropbox itself.
It costs nothing to be transparent with the people you claim are valued customers.
#25 Salesforce
Salesforce is a cloud-based customer relationship management (CRM) platform with AI-driven analytics; Slack, which Salesforce acquired in 2021, is its messaging app.
Salesforce collects a huge amount of business and customer data, including behavioral data, and retains that data indefinitely unless proactively deleted. Salesforce has thousands of integrations, which creates a broad data-sharing ecosystem. Salesforce also tracks users to measure engagement and conducts targeted marketing. Additionally, it uses business and customer data to train its AI. While the data is anonymized, it may still be proprietary, and anonymized data could still have value in the hands of a competitor.
Meanwhile, Slack has myriad features that allow employers to track and surveil employees, including their activities, conversations, and private messages.
#26 Evernote
Evernote, a popular cloud-based note-taking app, allows users to create, organize, and sync notes across multiple devices. It’s a pretty handy app, but it over-collects and over-shares data, and with your permission may view your content. This puts the onus on users to toggle all the correct permissions and settings.
Additionally, users are cautioned to avoid using the app for sensitive data, such as student information, medical data, passwords, and network and firewall settings due to privacy and security concerns. Users are also encouraged to encrypt their notes.
#27 Expensify
Expensify is a software platform that allows users to track expenses, receipts and travel. It has extensive data collection and sharing and third-party tracking practices wholly unnecessary to providing its core services. Given the sensitivity of the information and the risk of identity theft or fraud, we don’t believe the overreach is warranted for a platform of this nature.
#28 BambooHR
BambooHR is a cloud-based human resources platform that stores data related to an individual’s employment and benefits. It’s not concerning that they collect this data, but it is concerning that they use trackers and advertising cookies, share the data with a number of third parties, and are allowed to use the data for targeted marketing, internal research, and to develop their services.
It has also been rated as having one of the most difficult-to-parse privacy policies.
#29 Canva
Canva is a popular graphic design platform. Canva gathers a wide range of information, including user-generated content, usage data, device information, and geolocation data. Canva also claims the right to share that data broadly with third parties and may use it for numerous purposes, and it uses tracking tools and beacons. Canva’s privacy policy states:
“We may analyze your activity, content, media uploads and related data in your account to provide and customize the Service, and to train our algorithms, models and AI products and services using machine learning to develop, improve and provide our Service.”
None of these things are critical to the core service.
#30 Discord
Discord is a free app for text, voice, and video communication, with quite a few integrations with other apps that also collect a lot of data (e.g., Twitch, YouTube, Steam, and Spotify). In addition to broad data collection policies, Discord has vague data retention practices and was recently called out by the FTC for the risk this creates for users. Discord tracks things like location data, which servers you join, and your engagement patterns; messages are not end-to-end encrypted; and Discord can access your data for moderation or analytics purposes, share it with third parties, and use it for targeted marketing.
How to Protect Yourself
Now that you’re aware of the risks and how they arise, how do you best protect yourself? First, be aware that the law almost always lags the tech. While our lawmakers play catch up, the best defense is a good offense, and that generally falls to you.
AI Safety + Best Practices
Be thoughtful about which browsers, software, and apps you use. If you lie down with dogs, you wake up with fleas.
Consider whether you actually need deep integrations (e.g., for business intelligence or CRM). If not, more privacy-forward alternatives are often available.
Read terms of use and privacy policies (ugh)
Minimize the data you share, and opt out of data sharing and tracking wherever possible
Employ good security hygiene, such as using unique dynamic passwords, multifactor authentication, end-to-end encryption, and zero trust policies; keep software updated
Customize the software’s access and permissions where possible
Use privacy tools and plugins, such as VPNs, Do Not Track widgets, ad blockers, and Hide My Email
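The “unique dynamic passwords” item above is easy to put into practice. As a minimal sketch (the length and character set here are our own choices, not a standard), Python’s `secrets` module draws from a cryptographically secure random source:

```python
import secrets
import string

def make_password(length: int = 20) -> str:
    """Generate a random password using a cryptographically secure RNG."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

# Generate a distinct password per site; never reuse one across services.
print(make_password())
```

In practice a password manager does this for you; the point is that each service gets its own throwaway secret, so one provider’s breach (see Equifax, PayPal, and friends above) can’t cascade into your other accounts.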
The post-information age requires a new kind of digital citizen: one who is informed, savvy, and skilled in the ways of privacy and security. Once you get the hang of it, it becomes second nature. Let us help you get there.
Don’t leave your AI journey to chance
At AiGg, we understand that adopting AI isn’t just about the technology—it’s about doing so responsibly, ethically, and with a focus on protecting privacy. We’ve been through business transformations before and are here to guide you every step of the way.
Whether you’re a government agency, school district, or business, our experts—including attorneys, anthropologists, data scientists, and business leaders—can help you craft Strategic AI Use Statements that align with your goals and values. We’ll also equip you with the knowledge and tools to build your playbooks, guidelines, and guardrails as you embrace AI.
Connect with us today for your free AI Tools Adoption Checklist, Legal and Operational Issues List, and HR Handbook policy. Or, schedule a bespoke workshop to ensure your organization makes AI work safely and advantageously for you.
Your next step is simple—reach out and start your journey towards safe, strategic AI adoption with AiGg.
Live action shot of our process.