When you watch online videos, many services in the background log personal data such as your IP address, along with what you are watching, when, where and how. We opened the link below in a clean Safari browser (no history, no cache, no cookies) with the Network inspector open:

https://multimedia.europarl.europa.eu/en/video/roberta-metsola-ep-president-new-years-message_I251105

Here are the sources the page loaded from (see the screenshots at the bottom):

CloudFront (Amazon, U.S. company)
multimedia.europarl.europa.eu resolves to d1a0099pczmgnq.cloudfront.net.

Kaltura (U.S. company) and AWS (Amazon, U.S. company)
api.eu.kaltura.com resolves to 3.253.199.196 (Amazon)

CloudFront (Amazon, U.S. company)
api.multimedia.europarl.europa.eu resolves to d2v3ijbary9xcz.cloudfront.net

Kaltura (U.S. company) and AWS (Amazon, U.S. company)
api.irp2.ovp.kaltura.com resolves to 3.253.199.196 (Amazon)

Kaltura (U.S. company) and CloudFront (Amazon, U.S. company)
cfvod.irp2.ovp.kaltura.com resolves to d10c7x9xw8zetq.cloudfront.net

AWS (Amazon, U.S. company)
epwa.europarl.europa.eu resolves to 18.239.50.43 (Amazon)
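
The lookups above can be reproduced with a short script. What follows is a minimal sketch in Python, assuming that a canonical name ending in cloudfront.net indicates CloudFront. The helper names are our own and are not part of any tool used for this article. Plain IP addresses such as 3.253.199.196 cannot be attributed this way and would need to be checked against the IP ranges Amazon publishes.

```python
import socket

# Assumption: a canonical name under this suffix is served by Amazon CloudFront.
CLOUDFRONT_SUFFIX = ".cloudfront.net"


def is_cloudfront(name: str) -> bool:
    """Return True if a DNS name lies under the CloudFront domain."""
    return name.lower().rstrip(".").endswith(CLOUDFRONT_SUFFIX)


def canonical_name(hostname: str) -> str:
    """Resolve a hostname and return the canonical name reported by the resolver.

    Needs network access; the answer depends on your resolver and location.
    """
    canonical, _aliases, _addresses = socket.gethostbyname_ex(hostname)
    return canonical


# Example (not executed here, because it needs network access):
#   name = canonical_name("multimedia.europarl.europa.eu")
#   print(name, is_cloudfront(name))
```

Keeping the suffix check separate from the network call makes it possible to test the classification without touching DNS.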

Now you may ask: what does it matter that this page and video are hosted by Amazon?

Logging

Every time you open a web page or app, every object loaded in your browser or app is logged. When you access the link above, about 100 objects are loaded. That means your IP address is also logged about 100 times.

This is what a video stream log line can look like, for example:

165.225.17.167 - - [08/Jan/2024:09:53:54 +0000] "GET /session/58a310651badb55b7d7adbed72a411f9/sz/Net26/wowza4/live/Infostream/media_284120120.ts HTTP/2.0" 200 759332 "https://takeoff.jetstre.am/?account=Net26&file=Infostream&type=live&service=wowza&output=player&autostart=1" "Mozilla/5.0 (SMART-TV; Linux; Tizen 6.5) AppleWebKit/537.36 (KHTML, like Gecko) SamsungBrowser/5.0 Chrome/85.0.4183.93 TV Safari/537.36" 79 0.009 [core-ams01-core-nginx-80] [] 10.233.109.221:80 759332 0.012 200 9ab548984d205cd4afb3403518924300 "-"

In order, the fields are: remote IP (of the client), user name (empty here), date and time, method, URI, HTTP version, status code, response size in bytes, referer, user agent, request size in bytes, response time in seconds, internal service name, backup internal service name, upstream internal IP, bytes sent, total request time (including delivery to the viewer), status code sent to the viewer, and a unique request ID for logging purposes (not traceable).
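
To show how easily such a line can be turned into structured data, here is a minimal sketch in Python that splits the sample line above into named fields. The regular expression and the field names are our own reading of this one sample line. They are an assumption, not the actual log format definition used by the service.

```python
import re

# The sample log line from the article, split over several string literals.
LOG_LINE = (
    '165.225.17.167 - - [08/Jan/2024:09:53:54 +0000] '
    '"GET /session/58a310651badb55b7d7adbed72a411f9/sz/Net26/wowza4/live/'
    'Infostream/media_284120120.ts HTTP/2.0" 200 759332 '
    '"https://takeoff.jetstre.am/?account=Net26&file=Infostream&type=live'
    '&service=wowza&output=player&autostart=1" '
    '"Mozilla/5.0 (SMART-TV; Linux; Tizen 6.5) AppleWebKit/537.36 '
    '(KHTML, like Gecko) SamsungBrowser/5.0 Chrome/85.0.4183.93 TV Safari/537.36" '
    '79 0.009 [core-ams01-core-nginx-80] [] 10.233.109.221:80 759332 0.012 200 '
    '9ab548984d205cd4afb3403518924300 "-"'
)

# Hypothetical field names, chosen to mirror the list in the text.
LOG_PATTERN = re.compile(
    r'(?P<remote_ip>\S+) \S+ (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<http_version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<response_bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)" '
    r'(?P<request_bytes>\d+) (?P<response_time>[\d.]+) '
    r'\[(?P<service>[^\]]*)\] \[(?P<backup_service>[^\]]*)\] '
    r'(?P<upstream>\S+) (?P<bytes_sent>\d+) (?P<total_time>[\d.]+) '
    r'(?P<upstream_status>\d{3}) (?P<request_id>\S+)'
)


def parse_log_line(line: str) -> dict:
    """Return the named fields of one log line, or raise ValueError."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError("line does not match the assumed log format")
    return match.groupdict()


fields = parse_log_line(LOG_LINE)
print(fields["remote_ip"], fields["time"], fields["user_agent"])
```

A few lines of code are enough to pull out who (IP address), what (URI and referer), when (timestamp) and with which device (user agent) for every single request.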

IP address logging

As you can see, a lot of data is logged, including your IP address. Your IP address is personal data. From it, it is easy to tell which city you are in. It can also be linked to your identity. Suppose you create an account or make a purchase on just one site. That site then knows your name and address, and can link them to your IP address.

Profiling

Even if you then shield your IP address, for example via a VPN, you are still easy to recognize, because much more is logged: not only what you view, but also with which device, software, plugins, fonts, language and screen size. Combined, these details make it possible to create a unique profile of you. Try for yourself here!
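
The idea of recognizing a visitor without an IP address can be sketched in a few lines of Python. This is a simplified illustration under our own assumptions: the attribute names and values are hypothetical, and real fingerprinting scripts collect many more signals than shown here.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of browser attributes into a short, stable identifier."""
    # Sorting the keys makes the result independent of dictionary order.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical browser attributes, for illustration only.
browser = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 12)",
    "language": "nl-NL",
    "screen": "2560x1440",
    "fonts": ["Arial", "Helvetica", "Verdana"],
    "plugins": [],
}

# Two visits from different IP addresses (documentation ranges), same browser.
visit_home = ("203.0.113.10", fingerprint(browser))
visit_vpn = ("198.51.100.7", fingerprint(browser))

# The IP address changed, but the fingerprint did not.
same_visitor = visit_home[1] == visit_vpn[1]
print(same_visitor)  # True
```

Because the IP address is not part of the hash, switching to a VPN leaves the identifier unchanged. Only a change in the browser attributes themselves would alter it.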

Big data is big business

Parties like Facebook and Google, and many others, have made collecting this data their core business. They also link a lot of data together. A big data trade has developed, because the more they learn about you, your interests and your surfing and viewing habits, the more their data sets are worth.

No consent

Without your permission and without your knowledge, they collect this information all day long. Then they use this data to make money, either by exploiting it themselves or by selling data sets to the highest bidder.

Political manipulation scandal

The danger of this rampant data hunger is that data can be used against you. A well-known example is the Facebook-Cambridge Analytica scandal, in which data on millions of people was shared without their knowledge or consent. These people were then targeted with political ads based on their profiles. This has led to undesirable manipulation of democratic elections in several countries.

Data can be dangerous

So yes, you can rightly worry about who is logging what data and how it is being protected. Information about your political leanings, creed, religion, race, education or sexual orientation can be damaging and can be used against you.

Who can you trust?

Now there is another problem at play: even if the parties logging data act in good faith, their government may not.

Far-reaching access to data, even on EU soil

Intelligence agencies are particularly interested in this same digital data. U.S. intelligence agencies are known to demand data from U.S. companies and from U.S. citizens working at digital companies. They make no secret of this: the USA PATRIOT Act and the U.S. CLOUD Act establish by law that U.S. government agencies may requisition data, even if the data is physically located in the EU and even if it is held by European subsidiaries.

Fewer safeguards

In addition, the U.S. offers citizens and businesses far fewer safeguards when their data is collected, combined, processed, enriched, profiled, shared, and sold. U.S. privacy laws do not meet the data processing requirements of the European GDPR.

EU-exclusive

For that reason, it is strongly recommended to use only service providers that operate physically, operationally, and legally within the EU: with physical servers and data in the EU, staff who reside in the EU only, and a company, suppliers and owners that are all based in the EU. U.S.-owned so-called sovereign cloud services fail this test, since a legal tie to the U.S. remains.

Remarkable choice

In that light, it is remarkable that the EU itself still takes the use of U.S.-based Amazon relatively lightly: there is a risk that data on the viewing behavior of visitors to this official EU website could be used to profile their political preferences. We have our reservations about this.

In general, it’s much worse

Incidentally, this is one of the mildest examples we have found. Many sites work with a long list of different vendors, each with its own mix of clouds and CDNs.