Tracking your privacy
As of June 2017, 3.88 billion people are online. Out of 7.51 billion people living on our planet, 51.7% use the Internet. Everything is connected. Everything will be connected. The Internet is awesome: we all learn, create and share information over it. The time we all spend online, especially the younger generation, is constantly increasing. This is the true power of the Internet.
But as the Internet opened up to all of us, we also opened ourselves up to it. To be able to use this “free” Internet of ours, we have a price to pay, and that price is our privacy. Our privacy is now the currency we pay for the “free” digital trove of information on the Internet, without even realizing it.
We leave all our private information on the Internet: birthdays, places of residence, phone numbers, email addresses, social security numbers, credit cards, GPS coordinates of our movements, our preferences and interests, not to mention our relationships, financial histories, and political and religious views. The list goes on, and we sometimes give all of this away without even knowing it. More often than not, this information is also copied and shared further without our knowledge. In the offline world, when someone takes something from you without your consent or knowledge, we call it stealing, and there are laws protecting us from such acts.
Let me be perfectly clear: sharing data is a good thing, but only when we consent to it and when we are the ones who control who gets our data.
In recent years, the trend of behavioral tracking and targeting has created huge business opportunities for many companies.
How huge? Take a look at this marketing landscape with 4,891 unique companies worldwide (a 40 MB hi-res version can be downloaded here).
A bit of history #
It all started in the early 1990s with web hit counters. Remember these?
We can only suspect that many web sites deploy these analytics providers to see how people use their site or to identify broken or confusing web pages, perhaps without even realizing what these scripts actually do to their visitors. The analytics companies place “scripts” on their clients’ web sites that are not inspected, and usually not controlled in any way, by the web site publisher.
Is it me? #
My curiosity once again led me to another investigation. Last time I presented my findings on “How secure is .rs”. This time I looked at the current state of trackers on popular Serbian web sites. In 2017 Serbia has an estimated population of 7.11 million (not counting Kosovo), of whom 4.76 million, or 67%, use the Internet. Not to my surprise, I am again getting that paranoid feeling while discovering the state of web tracking on the Top20 Serbian web sites.
|Web site||Hosted in||3rd Party||HTTPS||PP||PII sharing declared|
- Note: kupujemprodajem.com tried to register with the national registry, but the registration was unsuccessful for technical and procedural reasons. This information was provided during our closed meeting with the Top20 media companies, and we disclose it here as an update to the list above.
After checking each of the Usage and/or Privacy policies of the Top20 sites mentioned above, the conclusion is staggering: 15 (75%) of the Top20 web sites make no declaration about sharing PII data, while only 5 do so in some form. It is worth noting that even the sites whose policies address 3rd party sharing in some form do not clearly inform visitors about this practice.
It’s that awful feeling that someone is watching everything you do online, without your consent and without you noticing anything. At the beginning of my assessment I thought that some (not all) web sites employ only a few ad/tracker servers and boy, was I wrong. Let’s take a quick look at visualizations of the index pages of just a few popular web sites that are visited daily by millions of users:
Quote from the Privacy & Usage policy of Blic Online: Blic Online respects the privacy of its users and visitors. Data collected by the registration process, as well as all other user data, will not be given by Blic Online to third parties. User data will not be available to third parties unless required by law or in case of a major violation of Blic Online rules. (local copy)
Quote from the Privacy & Usage policy of Telegraf: We collect the following information from users: your domain, IP address, browser and operating system type, the time and duration of your visit, all pages that you access, the address of any linked web site, and any information that you provide to us voluntarily.
Also, we do not create any individual profiles based on the information that you provide to us, nor do we send that information to any other private organizations. Your data will be provided to government bodies only if we are obliged by law to do so. (local copy)
Quote from the B92 Data processing page: Your data will be used solely for the purpose of granting access to and usage of B92 services. By accepting these rules, you agree that your data will be used for the aforementioned purpose. The policy mentions data use only for a few pages, not the index page, and it says not a single word about all the other connections that are allowed on the index page. (local copy)
Note: The graphs above are from 2017/11/07 and were generated by evidon.com.
On the graphs above you can see where your data goes when you access the www.blic.rs, www.telegraf.rs or www.b92.net web sites. The situation is no better on any of the other Top20 web sites. Each circle represents a company that provides one or more of the following services: ad serving, analytics, tracker, widget, privacy checks or unknown. The arrows show the approximate direction of data flow, and as you can see, data is also shared between these companies, so that they can “align” their data to better profile you as an individual user and at the same time better target you with ads. This data is then saved indefinitely, shared between companies, and analyzed many times. Just to be clear, all this data is gathered, stored and processed automatically; there is zero human intervention in this process. With the rise of machine learning and AI, this data will reveal even more patterns and value to the companies that hold it.
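To make the idea of 3rd party requests concrete, here is a minimal Python sketch that separates a page's own requests from requests going to other companies. It is not the method used to produce the graphs above. The request URLs are made up for illustration (only the domains adsrvr.org and scorecardresearch.com appear in this post), and the "last two labels" rule for finding a site's domain is a naive assumption that breaks for suffixes such as .co.rs or .co.uk.

```python
from urllib.parse import urlsplit


def registrable_domain(host: str) -> str:
    """Naive assumption: the site's domain is the last two labels of the host."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def third_party_hosts(page_url: str, request_urls: list[str]) -> list[str]:
    """Return the unique hosts contacted by a page that belong to another domain."""
    site = registrable_domain(urlsplit(page_url).hostname or "")
    found = []
    for url in request_urls:
        host = urlsplit(url).hostname or ""
        if host and registrable_domain(host) != site and host not in found:
            found.append(host)
    return found


# Hypothetical requests observed while loading one index page.
observed = [
    "https://www.blic.rs/style.css",
    "https://static.blic.rs/logo.png",
    "https://match.adsrvr.org/track",
    "https://sb.scorecardresearch.com/beacon.js",
    "https://match.adsrvr.org/sync",
]

print(third_party_hosts("https://www.blic.rs/", observed))
```

With real data, the list of observed URLs could come from the browser's developer tools network log; a production-grade version would use the Public Suffix List instead of the two-label shortcut.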
Wait, what? #
- What is collected? - Everything* ad companies can get their hands on.
- 3rd party sharing? - Full aggregated data sharing with 3rd parties.
- What is the data retention policy? - Undisclosed or indefinite.
- Opt out? - Almost no viable options.
- Are visitors/users notified or informed? - No.
*This means both PII data and anonymous data.
How can they do this? #
Legally speaking, they can’t. Under the provisions of Serbian data protection legislation, Serbian companies are obliged to:
- Clearly inform persons about actual data collection and manipulation
- Register with the Serbian Data Protection Commissioner and declare what data they collect and with whom that data is shared
- Provide clear opt-out instructions
The main problem here is that the information required by law is hidden from visitors, or visitors are blatantly misinformed by obscure Privacy/Usage Policy texts. We have to keep in mind that the aforementioned Serbian companies (Top20) are free to set their own rules and PII data policies within the given legal framework. The Top20 Serbian web sites choose not to comply with even the simplest legal requirements.
The Privacy/Usage Policy texts that we analyzed either did not mention that user information is shared with third parties, or expressly stated that NO PII is shared with other organizations. Also, under the Serbian Data Protection Law, personal information on Serbian citizens cannot be transferred to certain countries/jurisdictions, especially not without the Commissioner’s permission.
Where does our data go? #
The author compiled a list of 137 companies that you share your data with by visiting the Top20 web sites in Serbia. They also exchange data among themselves when you visit other sites. (The list below contains only Tracker/AdPlatform companies, 110 in total; CDN, Hosting and Unknown companies are excluded.)
|Eyeota Pte Ltd||eyeota.net||N||AU||Policy|
|Casale Media Inc.||casalemedia.com||N||CA||Policy|
|Media.Net Advertising Ltd||media.net||N||CN||Policy|
|advanced store GmbH||ad4mat.de||N||DE||Policy|
|ADC Media GmbH||adc-srv.net||N||DE||Policy|
|Integral Ad Science||adsafeprotected.com||Y||DE||Policy|
|Dynamic 1001 GmbH||dyntracker.com||Y||DE||Policy|
|PE Digital GmbH||parship.de||Y||DE||Policy|
|The Reach Group GmbH||redintelligence.net||Y||DE||Policy|
|TRADEERS E-COMMERCE GMBH||tnm.de||Y||DE||Policy|
|Sedo Holding AG||webmasterplan.com||Y||DE||Policy|
|Ascio Technologies Inc.||adrtx.net||Y||DK||Policy|
|Flowplayer merged with Lemonwhale||flowplayer.org||Y||FI||Policy|
|Httpool Holdings Ltd||httpool.com||Y||HU||Policy|
|Genius Sports Group Limited||connextra.com||Y||UK||Policy|
|AddThis (acquired by Oracle)||addthis.com||Y||US||Policy|
|AddThis (acquired by Oracle)||addthisedge.com||Y||US||Policy|
|The Trade Desk||adsrvr.org||Y||US||Policy|
|Atlas (acquired by Facebook)||atdmt.com||Y||US||Policy|
|BrightRoll (acquired by Yahoo)||btrll.com||Y||US||Policy|
|AllVoices Inc. & PulsePoint Holdings, LLC||contextweb.com||Y||US||Policy|
|Lotame Solutions, Inc||crwdcntrl.net||Y||US||Policy|
|Moat, Inc. (acquired by Oracle)||moatads.com||Y||US||Policy|
|Visual IQ, Inc||myvisualiq.net||Y||US||Policy|
|Datalogix (acquired by Oracle)||nexac.com||Y||US||Policy|
|OpenX Software Ltd.||openx.net||Y||US||Policy|
|The Rubicon Project Ltd.||rubiconproject.com||Y||US||Policy|
|Full Circle Studies, Inc. / comscore||scorecardresearch.com||Y||US||Policy|
|Smart AdServer SAS||smartadserver.com||Y||US||Policy|
|FreeWheel Media, Inc.||stickyadstv.com||Y||US||Policy|
|Amobee, Inc. and Turn Inc||turn.com||Y||US||Policy|
|ConnectAd Demand GmbH||connectad.io||Y||AT||Policy|
Distribution of your data across the Top20 sites when they are accessed from Serbia:
- 59% will be in US
- 21% in Germany
- 20% in all other countries
- 0% in Serbia
As a side note, while checking the Top20 sites in Serbia we found that 9 of them use HTTPS to encrypt your access. This is a welcome change from just a few years back, when this was not the case. The author of this blog hopes that in the near future all Top20 sites will enable HTTPS for visitors. For the rest of the sites, which use the plain HTTP protocol, anyone between you and the site you are visiting can see what you read and what data you submit in forms such as comments or registration. When we say anyone, we mean your ISP, a motivated person, and the government.
Breakdown of the types of companies that will have access to your data:
- CDN (Content Delivery Network) 6%
- Hosting companies 5%
- Various libraries/tools include sites 5%
- Tracker/AdPlatforms 79%
- Various Plugins 1%
- Unknown 4%
Note: The Unknown percentage covers hosts for which we could not determine any background information (company name, main web site, contact details, location, purpose, age, responsible persons, history of the web site, etc.). We managed to analyze some of the scripts from those Unknown servers, and they are mostly Potentially Unwanted Program and/or adware/malware/spyware delivery platforms. For example, these check your browser plugin/add-on versions and try to sneak updates into outdated versions. We might also consider the possibility that some of these unknown companies are state-sponsored actors.
This is served on all Top20 Serbian web sites and delivered to millions of users every day. It is obvious that the Top20 sites do not check their ad sources in any capacity.
Now, if you think this is the only way ad companies can track you, think again. Ad companies have a problem linking your devices, as some people use a mobile phone to browse information and communicate, but use a notebook or desktop to, for example, make a purchase or visit other sites. Ad companies need these devices to be linked, and this is where cross-device tracking comes in. The technique involves inaudible, high-frequency (ultrasonic) beacons that are emitted when a commercial with embedded ultrasonic sounds is played on TV or radio. Apps on your phone or tablet can recognize these sounds (you will not, as you can’t hear them), and the software links your devices to a single user ID in the ad company’s database. This also allows the behavior of users to be tracked, including which ads a user saw and how long they watched an ad before changing the channel. Ars Technica published a nice article titled “Beware of ads that use inaudible sound to link your phone, TV, tablet, and PC”. This process is not yet widely adopted, but it’s a chilling thought how far some companies will go to gather as much as they can about you and your privacy.
Note: The infographic above is from the SilverPush website.
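How an app could recognize such an inaudible tone can be sketched with the Goertzel algorithm, a standard way to measure the strength of a single frequency in an audio buffer. This is only an illustration of the principle, not the method of any particular vendor. The 18 kHz beacon frequency, the 44.1 kHz sample rate, the threshold and the helper names `tone` and `has_beacon` are all assumptions made for this sketch.

```python
import math

SAMPLE_RATE = 44100   # assumed microphone sample rate in Hz
BEACON_HZ = 18000     # assumed near-ultrasonic beacon frequency in Hz


def goertzel_power(samples, sample_rate, target_hz):
    """Squared magnitude of the frequency bin nearest to target_hz (Goertzel)."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)          # nearest frequency bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def has_beacon(samples, threshold=0.1):
    """True if a noticeable share of the signal energy sits at BEACON_HZ."""
    energy = sum(x * x for x in samples)
    if energy == 0:
        return False
    # For a pure tone at the beacon frequency this share is about 1.0.
    share = goertzel_power(samples, SAMPLE_RATE, BEACON_HZ) / (energy * len(samples) / 2)
    return share > threshold


def tone(freq_hz, n=4410):
    """Generate n samples (0.1 s at 44.1 kHz) of a sine wave for testing."""
    return [math.sin(2.0 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


audible = tone(440)                                   # an ordinary audible note
mixed = [a + 0.5 * b for a, b in zip(audible, tone(BEACON_HZ))]
print(has_beacon(audible), has_beacon(mixed))
```

The same check can be used defensively: running it over a microphone recording while the TV is on would indicate whether something is broadcasting at the assumed frequency.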
The list above of companies operating on our Top20 sites includes, for example, RocketFuel and Atlas, which provide such services. One of the companies that uses artificial intelligence and machine learning algorithms is AdBrain. This company is used, for example, through theTradeDesk on the web site of the Telenor mobile provider in Serbia.
Adbrain ingests billions of raw data points from third parties such as analytics providers, media companies, dev tools and SDK providers. The data sets are then ingested into a series of advanced machine learning algorithms which classify IDs as devices, people and places. The output of the series of algorithms is a holistic map of customers and their relationship with the Devices they use, the Places they frequent, and the People they interact with.
Another way to cross-link your devices is to have your phone number. In my view, this was the real reason Facebook acquired WhatsApp: to gather the phone numbers that WhatsApp collected, and then to link your mobile device with your user account on desktop and/or any other device.
Session replay with mouse movement tracking is another way to fully analyze a visitor’s on-site behavior. This is a somewhat new technology, superseding the old click heat maps used a few years back. By using special scripts from ad companies, it’s possible to record the full “session” of a user interacting with a web page, including mouse movements and keystrokes, and to preserve the session for later replay. This is how it works:
Research on this topic was recently done and published by Princeton University; more information, as well as the full list of companies using these technologies, can be found here. Wired also ran a story named “The Dark Side of ‘Replay Sessions’ That Record Your Every Move Online”, detailing the information published in this research.
You might think all this is a joke, but I will remind you that the US NSA and UK GCHQ agencies are piggybacking on these commercial technologies to identify people browsing the Internet. More details on this were published by The Washington Post in 2013 under the title “NSA uses Google cookies to pinpoint targets for hacking”. These revelations came to light after Edward Snowden provided insights to the general public.
As for those who say “I am not a target of those agencies …”: you don’t have to be. It takes just ~$1000 to track any person’s location and movement by using targeted mobile ads. Wired published an awesome article on that subject. Also, the Paul G. Allen School of Computer Science & Engineering at the University of Washington published a very detailed paper on this topic, “Exploring ADINT: Using Ad Targeting for Surveillance on a Budget” (pdf).
With all this business built on the personal data you generate, it’s only a matter of time before your ISP or telecom/mobile provider decides to profit from it. In the US and parts of the EU this is already a trend. Mobile phone usage is tracked and monetized by Verizon, Sprint, Telefonica (as well as other global carriers), partnering with firms including SAP, IBM, HP, and AirSage that manage and package different levels of the collected data. By mining the data produced by cellphone users, the carriers can tie together location, mobile browser usage and call information into one data stream. A nice article on this topic was published by InformationWeek, titled “Mobile Carriers Cashing In On Mining Your Data”.
Currently we have no information on whether Serbian ISPs and/or telecom operators sell data to or exchange data with any ad companies, but we do know that they collect data, per the information provided by the National Registry (see Table 3).
|Company Name||HTTPS||3rd party||Registry||PP||PII sharing declared|
|Telekom Srbija||Y||19||Registry||Policy||N (Court, DA, Authorities only)|
|Radijus Vektor||Y||5||Registry||No online info.||N|
|BeotelNet||Y||15||Registry||Policy||N (Court, DA, Authorities only)|
|Targo Telekom||N||12||Registry||Policy||N (Court, DA, Authorities only)|
|Telenor||Y||37||Registry||Policy||N (Court, DA, Authorities only)|
|VIP||Y||12||Registry||Policy||N (Court, DA, Authorities only)|
The only ISP/carrier from the above list that, per the Registry, declared the option to sell data to 3rd parties is Orion Telekom, as it is stated that a user’s personal details, online identity and activity could be given to third parties.
Remember, we pay our monthly Internet bill to either our carrier or our ISP to be able to access the Internet. We don’t pay it to give them a chance to collect and sell our private data to make more money.
Legislation & Regulation #
This tracking business has gathered so much speed that serious legislation and oversight are urgently needed. EU lawmakers recognized this and incorporated it into “Regulation (EU) 2016/679 of the European Parliament and of the Council” (the General Data Protection Regulation, or GDPR for short), which supersedes Directive 95/46/EC and will be enforced across all EU member states from 25 May 2018.
From GDPR: “Persons may be associated with online identifiers, such as internet protocol addresses, cookie identifiers or other identifiers. This may leave traces which, in particular when combined with unique identifiers and other information received by the servers, may be used to create profiles of the persons and identify them.”
What this essentially tells us is that cookies, where they are used to uniquely identify a device or, in combination with other data, the individual associated with or using the device, should be treated as personal data. Under the GDPR, any cookie or other identifier that is uniquely attributed to a device, and is therefore capable of identifying an individual or treating them as unique even without identifying them, is personal data. There are two important terms: Personally Identifiable Information (PII) and Sensitive Personal Information (SPI). The following data is classified as PII:
- Full name (if not common)
- Home address
- Email address
- National identification number
- Passport number
- IP address
- Vehicle registration plate number
- Driver’s license number
- Face, fingerprints, or handwriting
- Credit card numbers
- Digital identity
- Date of birth
- Genetic information
- Telephone number
- Login name, screen name, nickname, or handle
- First or last name, if common
- Country, state, postcode or city of residence
- Age, especially if non-specific
- Gender or race
- Name of the school attended or workplace
- Grades, salary, or job position
- Criminal record
- Web cookie
Article 42 of the Serbian Constitution states:
Protection of personal data shall be guaranteed.
Collecting, keeping, processing and using of personal data shall be regulated by the law.
Use of personal data for any purpose other than the one they were collected for shall be prohibited and punishable in accordance with the law, unless this is necessary to conduct criminal proceedings or protect safety of the Republic of Serbia, in a manner stipulated by the law.
Everyone shall have the right to be informed about personal data collected about him, in accordance with the law, and the right to court protection in case of their abuse.
The dedicated Serbian “Personal Data Protection Law” defines personal data as any information pertaining to a natural person, regardless of the form in which it is expressed or the information carrier (paper, tape, film, electronic media, etc.), on whose order or on whose behalf the information is stored, the date the information was created, the place where the information is stored, the way the information is learned (directly, through listening, viewing, etc., or indirectly, by inspecting a document in which the information is contained, etc.), and regardless of any other property of the information (hereinafter: data). A natural person is defined as the person to whom the data relates, whose identity is determined or can be determined on the basis of a personal name, unique citizen identification number, address code or other characteristic of his physical, psychological, spiritual, economic, cultural or social identity.
Also, the Personal Data Protection Law states the following in Article 14:
Data shall be collected from data subjects and from public authorities authorized under the law to collect such data.
Data may also be collected from third parties if:
1) envisaged by a contract concluded with a data subject;
2) envisaged by a law or another regulation passed pursuant to a law;
3) necessary taking into account the nature of the task;
4) data collection from a data subject is time-consuming or requires disproportionately high resources;
5) data are collected for the purpose of achieving or protecting vital interests of a data subject, in particular his/her life, health and physical integrity.
Article 15 of the Serbian Personal Data Protection Law also stipulates the following obligations:
The entity who collects data from data subjects or from third parties shall, before data collection, inform the data subject or the third party of:
1) His/her identity, i.e. name and address or business name, or the identity of another person responsible for data processing under the law;
2) The purpose of data collection and subsequent processing;
3) The manner in which data will be used;
4) The identity or categories of persons who will use data;
5) The mandatory nature and legal grounds or else the voluntary nature of data provision and processing;
6) The right to withdraw one’s consent to processing and the legal consequences in the event of withdrawal;
7) The data subject’s rights in case of unlawful processing;
8) Other circumstances the withholding of which from a data subject or a third party would be contrary to conscientious treatment.
How is your data used? #
We collect PII via this website when it is provided to us. The PII we collect may include name, address, phone number, email address, or any other information provided to us. We use the PII collected via this website for business purposes, such as marketing, recruiting and hiring, responding to general inquiries or customer support requests, and in the course of providing services to customers. We retain the PII collected via the website indefinitely, unless otherwise specified, or until we delete the PII pursuant to a request from the subject of the PII.
Basically, an ad company will accept any data it can get its hands on, process it any way it sees fit, and store it indefinitely.
Princeton’s WebTAP privacy project recently found that Google’s trackers are installed on 75% of the top million internet websites. (The next closest is Facebook at 25%).
Google sells ads not only on its search engine, but also on more than 2.2 million other websites and more than 1 million apps. Every time you visit one of these sites or apps, Google stores that information and uses it to target ads at you. That’s why you may have seen ads following you (the same ads on multiple sites you visited) across the whole Internet.
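Why the same ads can follow you from site to site is easier to see in a toy model. The sketch below simulates one tracker embedded on several web sites: on the first request it hands the browser a cookie ID, and every later request carrying that ID adds the visited site to the same profile. The class and method names are hypothetical and the logic is heavily simplified; it is not a description of how Google or any other company actually implements tracking.

```python
import uuid
from collections import defaultdict


class TrackerServer:
    """Toy model of a single 3rd party tracker embedded on many web sites."""

    def __init__(self):
        # cookie ID -> list of sites on which that browser was seen
        self.profiles = defaultdict(list)

    def handle_request(self, cookie_id, visited_site):
        """Log the visit and return the cookie ID the browser should keep."""
        if cookie_id is None:
            cookie_id = uuid.uuid4().hex   # first contact: issue a new ID
        self.profiles[cookie_id].append(visited_site)
        return cookie_id


tracker = TrackerServer()
cookie = None  # a fresh browser has no cookie for the tracker yet

# Each of these sites embeds a script from the same tracker.
for site in ["www.blic.rs", "www.telegraf.rs", "www.b92.net"]:
    cookie = tracker.handle_request(cookie, site)

print(tracker.profiles[cookie])
```

Even though none of the three sites shares anything with the others directly, the tracker ends up with one browsing history keyed by a single identifier.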
How do you opt out? #
In short, you can’t; at least, there is no viable option at the moment. Yes, you have the option to sign out or opt out on some web sites. To opt out, you need to go to a special URL on the ad company’s web site and submit a request to opt out. What this actually does is that you will no longer be tracked by this particular company (and we need to trust them on that one), but you will still see the content they provide to the web sites you visit, only this time not as targeted as before. The opt-out relies on a cookie that is installed in your browser, so as soon as you re-install your computer or browser, or change devices, you are on the list again. The only option you currently have is to use add-ons for your browsers and completely block ad content; for more information, see below.
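The weakness of cookie-based opt-out can be shown with a small simulation. In this sketch the opt-out is nothing more than a special cookie value stored in the browser, so clearing the cookies silently removes it. The `Browser` class, the `serve_ad` function, the cookie values and the `tracker.example` domain are all hypothetical names chosen for illustration.

```python
OPT_OUT = "OPT_OUT"


class Browser:
    """Toy browser that only stores one cookie value per tracker domain."""

    def __init__(self):
        self.cookies = {}

    def clear_cookies(self):
        # Same effect as re-installing the browser or switching devices.
        self.cookies.clear()


def serve_ad(browser, tracker_domain):
    """Decide which kind of ad the tracker serves to this browser."""
    cookie = browser.cookies.get(tracker_domain)
    if cookie == OPT_OUT:
        return "generic ad"
    if cookie is None:
        browser.cookies[tracker_domain] = "tracking-id"  # start tracking
    return "targeted ad"


browser = Browser()
domain = "tracker.example"

first = serve_ad(browser, domain)           # tracked by default
browser.cookies[domain] = OPT_OUT           # user visits the opt-out URL
after_opt_out = serve_ad(browser, domain)   # opt-out honoured
browser.clear_cookies()                     # cookies wiped
after_clear = serve_ad(browser, domain)     # tracked again

print(first, "|", after_opt_out, "|", after_clear)
```

The opt-out lives on the user's side and has to be repeated for every company, browser and device, which is why blocking the requests altogether is the more robust option.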
What does your online day look like? #
We use all our Internet-connected devices throughout our daily life. When I got up, I opened a few regular news sites that I read. The circles represent web sites, while the triangles represent 3rd party web sites that my data was shared with.
By the end of the day, I was tracked by more than 400 unique companies on a single device alone:
What does the online profile of your kids look like? What would you do if you knew that hundreds of companies are tracking and targeting your kids? Imagine your kids walking down the street with hundreds of people following them, writing down everything they do while they are outside. Would you call the police?
How to protect your data #
Educating yourself and your friends is the first step. The next step is to support legal ways to control how this data is used. Check out the EFF web site, especially the EFF Behavioral Tracking segment. In Serbia, monitor and support the work of the office of the Commissioner for Information of Public Importance and Personal Data Protection. Also make sure you visit and read the work published by Share Foundation as well as the Share Foundation Labs project.
There is no ideal solution, but there are alternatives. I will briefly mention add-ons that you can use, namely:
Use alternatives to major search players:
To hide your browsing from your ISP, use a VPN, and to get better anonymity use Tor.
For temporary email use Mailinator (a disposable temporary email service) and Bugmenot when you need to quickly access some restricted web site without leaving your email. You could also follow some of the recommendations that I wrote in one of my previous blog posts, “Gerzićev vodič za prosečnog Internet korisnika”.
Advertising IDs (AAID/IDFA) on Android
From the apps list, open Google Settings → Ads → Reset advertising ID. In the confirmation box, tap OK to confirm resetting your Google advertising ID. Once done, your new Google advertising ID will appear at the bottom of the screen.
Interest Based Ads on Android
Open the Google Settings app → Ads and enable Opt out of Interest Based Ads.
Advertising IDs on iPhone/iPad
Open Settings app → Privacy → Advertising → Reset Advertising Identifier.
Limit Ad Tracking iOS
Open the Settings app on your iPhone → Privacy → Advertising.
Toggle ON the switch next to Limit Ad Tracking. Now your Apple ID will be removed from the list for receiving targeted ads.
Location Based Ad Tracking in iOS 10
Open Settings app → Privacy → Location Services → System Services.
Toggle off the switch next to Location-Based iAds.
Conclusion & TL;DR #
So why would this be important to you? For example, search engines (e.g. Google) will put you in a search bubble tailored to you (meaning you will see only the search results that the search engine’s algorithms think you should see, not necessarily what you are actually looking for) and will also show you personalized, targeted ads. Some retailers will increase the prices of their products or services based on your online activities. An HR company can use your online profile to decide whether you will be called to an interview for a job you applied for. In the near future, your credit or insurance score might be based on your online activities.
All this is being done without your (legally required) consent. Information on how your PII is being handled is deliberately obscured or hidden from you. Never mind the Directive, Constitution and the Law.
Do you still believe that your online activity does not influence your offline life? Do you still think that you do not have anything to hide online? Btw, we are just getting started: how about the political or religious views of a whole country? You might like to read the paper “Psychological targeting as an effective approach to digital mass persuasion”.
This research publication will be extended with additional details as the research continues, and we will publish relevant information in the upcoming weeks.
[2018/01/16] We just noticed that blic.rs is now on HTTPS, good news :)
[2017/12/14] Today we held a closed session with Top media representatives regarding our research, with direct support from the office of the Commissioner for Information of Public Importance and Personal Data Protection. The published research results were discussed, along with possible solutions and ways to improve visitors’ privacy awareness. The meeting was organized and supported by the ShareDefence organization and was held at the Parobrod venue.
The list of companies that provided policies is also updated. Please check table 1.
[2017/12/12] We noticed some changes, mainly that B92 is now on HTTPS (no HSTS yet, though), and that is cool! We do hope that the rest of the Top20 web sites will also switch to HTTPS by default for all visitors. Also, some web sites reduced their number of trackers, which is great news! :) We did manage to make some change already! Looking forward to more good news :)
[2017/12/11] Today I gave an interview for Belgrade Radio2 on this and some other topics. I had the pleasure of talking with Tamara Vucenovic. This will be on air in a few weeks. I will link the online version here when it’s published.
[2017/12/02] Published the request maps for all Top20 sites, with visible ad trackers mapped. They can be seen in table 1 by clicking on the 3rd party value of each web site. The tests were done from London, UK, with simulated Chrome browser activity. The total number of trackers on each tested web site varies based on the location you access the web site from, the browser you are using and other factors; it’s dynamic! The visual representation was made possible thanks to Simon Hearne @simonhearne
[2017/12/01] We held a short lecture with an open discussion at the Faculty of Organizational Sciences. These are the slides used.
Thank you #
The author would like to acknowledge the help received from Žarko Ptiček in correcting, translating and publishing this text in Serbian, and also from the office of the Commissioner for Information of Public Importance and Personal Data Protection for legal guidance and assistance. Thanks to Simon Hearne @simonhearne for making his tools public! For helping with the coordination and organization of the closed meeting with Top media representatives held on 2017/12/14, we would like to express our gratitude to the ShareDefence organization.
All original material from this blog post may be freely distributed at will under the Creative Commons Attribution License, unless otherwise noted. All material that is not original to this blog post may require permission from the copyright holder to redistribute.