History sniffing: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
m Adding short description: "Class of attacks tracking web browser history"
m Add good article icon
 
(44 intermediate revisions by 19 users not shown)
Line 1: Line 1:
{{good article}}
{{Good article}}
{{Short description|Class of attacks tracking web browser history}}
{{Use dmy dates|date=February 2024}}
'''History sniffing''' is a class of web vulnerabilities and attacks that allow a website to track a user's [[web browsing history]] activities by recording which websites a user has visited and which the user has not.
'''History sniffing''' is a class of web vulnerabilities and attacks that allow a website to track a user's [[web browsing history]] by recording which websites a user has visited and which the user has not. This is done by leveraging long-standing [[information leakage]] issues inherent to the design of the web platform, one of the best known of which is detecting [[CSS]] attribute changes in links that the user has already visited.

Despite being known since 2002, history sniffing is still considered an unsolved problem. In 2010, researchers revealed that multiple high-profile websites had used history sniffing to identify and track users. Shortly afterwards, [[Mozilla]] and the vendors of all other major web browsers implemented defences against history sniffing. However, more recent research has shown that these mitigations are ineffective against specific variants of the attack and that history sniffing can still occur via visited links and newer browser features.


== Background ==
{{Main|History of the World Wide Web|Web security|Same-origin policy}}
Early [[Web browser|browsers]] such as Mosaic and [[Netscape Navigator]] were built on the model of the web being a set of statically linked documents known as pages. In this model, it made sense for the user to know which documents they had previously visited and which they hadn't, regardless of which document was referring to it.<ref>{{Cite web |title=WorldWideWeb: Proposal for a HyperText Project |url=https://www.w3.org/Proposal.html |access-date=2023-11-15 |website=www.w3.org}}</ref> [[Mosaic (web browser)|Mosaic]], one of the earliest graphical web browser, would use purple links to show that a page has been visited and blue [[Hyperlink|links]] to show pages that had not been visited.<ref>{{Cite web |title=Why are hyperlinks blue? {{!}} The Mozilla Blog |url=https://blog.mozilla.org/en/internet-culture/deep-dives/why-are-hyperlinks-blue/ |access-date=2023-11-15 |website=blog.mozilla.org |language=en-US}}</ref><ref>{{Cite web |title=EMail Msg |url=https://ksi.cpsc.ucalgary.ca/archives/WWW-TALK/www-talk-1993q2.messages/47.html |access-date=2023-11-15 |website=ksi.cpsc.ucalgary.ca}}</ref> This paradigm stuck around and was subsequently adopted by all modern web browsers.<ref name=":0">{{Cite book |last1=Weinberg |first1=Zachary |url=https://ieeexplore.ieee.org/document/5958027 |title=2011 IEEE Symposium on Security and Privacy |last2=Chen |first2=Eric Y. |last3=Jayaraman |first3=Pavithra Ramesh |last4=Jackson |first4=Collin |date=2011 |publisher=IEEE |isbn=978-1-4577-0147-4 |pages=147–161 |language=en-US |chapter=I Still Know What You Visited Last Summer: Leaking Browsing History via User Interaction and Side Channel Attacks |doi=10.1109/SP.2011.23 |access-date=2023-10-30 |s2cid=10662023}}</ref>
Early [[Web browser|browsers]] such as [[Mosaic (web browser)|Mosaic]] and [[Netscape Navigator]] were built on the model of the web being a set of [[Static web page|statically]] linked documents known as pages. In this model, it made sense for the user to know which documents they had previously visited and which they hadn't, regardless of which document was referring to them.<ref>{{Cite web |title=WorldWideWeb: Proposal for a HyperText Project |url=https://www.w3.org/Proposal.html |access-date=15 November 2023 |website=www.w3.org |archive-date=29 June 2023 |archive-url=https://web.archive.org/web/20230629071324/https://www.w3.org/Proposal.html |url-status=live }}</ref> Mosaic, one of the earliest graphical web browsers, used purple links to show that a page had been visited and blue [[Hyperlink|links]] to show pages that had not been visited.<ref>{{Cite web |title=Why are hyperlinks blue? {{!}} The Mozilla Blog |url=https://blog.mozilla.org/en/internet-culture/deep-dives/why-are-hyperlinks-blue/ |access-date=15 November 2023 |website=blog.mozilla.org |language=en-US |archive-date=15 November 2023 |archive-url=https://web.archive.org/web/20231115192843/https://blog.mozilla.org/en/internet-culture/deep-dives/why-are-hyperlinks-blue/ |url-status=live }}</ref><ref>{{Cite web |title=EMail Msg |url=https://ksi.cpsc.ucalgary.ca/archives/WWW-TALK/www-talk-1993q2.messages/47.html |access-date=15 November 2023 |website=ksi.cpsc.ucalgary.ca |archive-date=15 November 2023 |archive-url=https://web.archive.org/web/20231115192842/https://ksi.cpsc.ucalgary.ca/archives/WWW-TALK/www-talk-1993q2.messages/47.html |url-status=live }}</ref> This paradigm stuck around and was subsequently adopted by all modern web browsers.<ref name=":0">{{Cite book |last1=Weinberg |first1=Zachary |url=https://ieeexplore.ieee.org/document/5958027 |title=2011 IEEE Symposium on Security and Privacy |last2=Chen |first2=Eric Y. 
|last3=Jayaraman |first3=Pavithra Ramesh |last4=Jackson |first4=Collin |date=2011 |publisher=IEEE |isbn=978-1-4577-0147-4 |pages=147–161 |language=en-US |chapter=I Still Know What You Visited Last Summer: Leaking Browsing History via User Interaction and Side Channel Attacks |doi=10.1109/SP.2011.23 |access-date=30 October 2023 |s2cid=10662023 |archive-date=24 December 2022 |archive-url=https://web.archive.org/web/20221224194909/http://ieeexplore.ieee.org/document/5958027/ |url-status=live }}</ref>


Over the years, the web evolved from its original model of static content towards favouring more dynamic content. In 1995, employees at [[Netscape]] added a scripting language, [[JavaScript|Javascript]], to its flagship web browser, Netscape Navigator. This addition allowed users to add interactivity to the web page via executing Javascript programs as part of the [[Rendering (computer graphics)|rendering process]].<ref>{{Cite web|url=https://www.webdesignmuseum.org/web-design-history/javascript-1-0-1995|title=JavaScript 1.0 - 1995|website=www.webdesignmuseum.org|language=en|access-date=2020-01-19}}</ref><ref>{{Cite web|url=http://home.netscape.com/eng/mozilla/2.0/relnotes/windows-2.0.html|title=Welcome to Netscape Navigator Version 2.0|date=1997-06-14|website=netscape.com|url-status=dead|archive-url=https://web.archive.org/web/19970614000538/http://home.netscape.com/eng/mozilla/2.0/relnotes/windows-2.0.html|archive-date=1997-06-14|access-date=2020-02-16}}</ref> However, this addition came with a new security problem, that of these Javascript programs being able to access each other's [[Execution (computing)|execution context]] and being able to gain access to sensitive information about the user. As a result, shortly afterwards, Netspace Navigator introduced the [[same-origin policy]]. 
This security measure prevented Javascript from being able to arbitrarily access data in a different web page's execution context.<ref>{{Cite web|url=http://wp.netscape.com/eng/mozilla/3.0/handbook/javascript/advtopic.htm#1009533|title=Netscape 3.0 Handbook - Advanced topics|website=netscape.com|url-status=dead|archive-url=https://web.archive.org/web/20020808153106/http://wp.netscape.com:80/eng/mozilla/3.0/handbook/javascript/advtopic.htm#1009533|archive-date=2002-08-08|access-date=2020-02-16|quote=Navigator version 2.02 and later automatically prevents scripts on one server from accessing properties of documents on a different server.}}</ref> However, while the same-origin policy was subsequently extended to cover a large variety of features introduced before it's existence, it was never extended to cover the hyperlinks since it was perceived to hurt the user's ability to browse the web.<ref name=":0" /> This innocuous omission would manifest into one of the well known and earliest forms of history sniffing known on the web.<ref name=":1">{{Cite journal |last=Van Goethem |first=Tom |last2=Joosen |first2=Wouter |last3=Nikiforakis |first3=Nick |date=2015-10-12 |title=The Clock is Still Ticking: Timing Attacks in the Modern Web |url=https://doi.org/10.1145/2810103.2813632 |journal=Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security |series=CCS '15 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=1382–1393 |doi=10.1145/2810103.2813632 |isbn=978-1-4503-3832-5}}</ref>
Over the years, the web evolved from its original model of static content towards more dynamic content. In 1995, employees at [[Netscape]] added a scripting language, [[JavaScript|Javascript]], to its flagship web browser, Netscape Navigator. This addition allowed users to add interactivity to the web page via executing Javascript programs as part of the [[Rendering (computer graphics)|rendering process]].<ref>{{Cite web|url=https://www.webdesignmuseum.org/web-design-history/javascript-1-0-1995|title=JavaScript 1.0 1995|website=www.webdesignmuseum.org|language=en|access-date=19 January 2020|archive-date=7 August 2020|archive-url=https://web.archive.org/web/20200807073237/https://www.webdesignmuseum.org/web-design-history/javascript-1-0-1995|url-status=live}}</ref><ref>{{Cite web|url=http://home.netscape.com/eng/mozilla/2.0/relnotes/windows-2.0.html|title=Welcome to Netscape Navigator Version 2.0|date=14 June 1997|website=netscape.com|url-status=dead|archive-url=https://web.archive.org/web/19970614000538/http://home.netscape.com/eng/mozilla/2.0/relnotes/windows-2.0.html|archive-date=14 June 1997|access-date=16 February 2020}}</ref> However, this addition came with a new security problem, that of these Javascript programs being able to access each other's [[Execution (computing)|execution context]] and sensitive information about the user. As a result, shortly afterwards, Netscape Navigator introduced the [[same-origin policy]]. 
This security measure prevented Javascript from being able to arbitrarily access data in a different web page's execution context.<ref>{{Cite web|url=http://wp.netscape.com/eng/mozilla/3.0/handbook/javascript/advtopic.htm#1009533|title=Netscape 3.0 Handbook Advanced topics|website=netscape.com|url-status=dead|archive-url=https://web.archive.org/web/20020808153106/http://wp.netscape.com:80/eng/mozilla/3.0/handbook/javascript/advtopic.htm#1009533|archive-date=8 August 2002|access-date=16 February 2020|quote=Navigator version 2.02 and later automatically prevents scripts on one server from accessing properties of documents on a different server.}}</ref> However, while the same-origin policy was subsequently extended to cover a large variety of features introduced before its existence, it was never extended to cover hyperlinks since it was perceived to hurt the user's ability to browse the web.<ref name=":0" /> This innocuous omission would manifest into one of the well known and earliest forms of history sniffing known on the web.<ref name=":1">{{Cite book |last1=Van Goethem |first1=Tom |last2=Joosen |first2=Wouter |last3=Nikiforakis |first3=Nick |chapter=The Clock is Still Ticking: Timing Attacks in the Modern Web |date=12 October 2015 |title=Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security |chapter-url=https://doi.org/10.1145/2810103.2813632 |series=CCS '15 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=1382–1393 |doi=10.1145/2810103.2813632 |isbn=978-1-4503-3832-5|s2cid=17705638 |url=https://lirias.kuleuven.be/handle/123456789/541727 }}</ref>


== History ==
[[File:Visited links vs unvisited links color difference.png|thumb|alt=Text with two links, one titled leukemia is purple, the other is not.|By extracting the colour of certain links, a website can access personally identifiable information. In this example, the website could infer that the user might be interested in [[leukemia]], a form of blood cancer.]]
One of the first publicly disclosed reports of a history sniffing exploit was made by Andrew Clover from [[Purdue University]] in a mailing list post on [[Bugtraq|BUGTRAQ]] in 2002. The post detailed how a malicious website could use Javascript to determine if a given link was of a specific colour, thus revealing if the link had been previously visited.<ref>{{Cite web |title=Bugtraq: CSS visited pages disclosure |url=https://seclists.org/bugtraq/2002/Feb/271 |access-date=2023-11-16 |website=seclists.org |language=en}}</ref> While this was initially thought of to be a theoretical exploit with little real-world value, later research by Jang et al. in 2010 revealed that much high profile website were using this technique in the wild to reveal user browsing data.<ref>{{Cite journal |last=Jang |first=Dongseok |last2=Jhala |first2=Ranjit |last3=Lerner |first3=Sorin |last4=Shacham |first4=Hovav |date=2010-10-04 |title=An empirical study of privacy-violating information flows in JavaScript web applications |url=https://doi.org/10.1145/1866307.1866339 |journal=Proceedings of the 17th ACM conference on Computer and communications security |series=CCS '10 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=270–283 |doi=10.1145/1866307.1866339 |isbn=978-1-4503-0245-6}}</ref> As a result of the publication of this research multiple lawsuits were filed against the websites that were found to have used history sniffing alleging a violation of the [[Computer Fraud and Abuse Act]] of 1986.<ref name=":1" />
One of the first publicly disclosed reports of a history sniffing exploit was made by Andrew Clover from [[Purdue University]] in a mailing list post on [[Bugtraq|BUGTRAQ]] in 2002. The post detailed how a malicious website could use Javascript to determine if a given link was of a specific colour, thus revealing if the link had been previously visited.<ref>{{Cite web |title=Bugtraq: CSS visited pages disclosure |url=https://seclists.org/bugtraq/2002/Feb/271 |access-date=16 November 2023 |website=seclists.org |language=en |archive-date=16 November 2023 |archive-url=https://web.archive.org/web/20231116005936/https://seclists.org/bugtraq/2002/Feb/271 |url-status=live }}</ref> While this was initially thought to be a theoretical exploit with little real-world value, later research by ''Jang et al.'' in 2010 revealed that high-profile websites were using this technique in the wild to reveal user browsing data.<ref>{{Cite book |last1=Jang |first1=Dongseok |last2=Jhala |first2=Ranjit |last3=Lerner |first3=Sorin |last4=Shacham |first4=Hovav |chapter=An empirical study of privacy-violating information flows in JavaScript web applications |date=4 October 2010 |title=Proceedings of the 17th ACM conference on Computer and communications security |chapter-url=https://doi.org/10.1145/1866307.1866339 |series=CCS '10 |location=New York, NY, USA |publisher=Association for Computing Machinery |pages=270–283 |doi=10.1145/1866307.1866339 |isbn=978-1-4503-0245-6|s2cid=10901628 }}</ref> As a result, multiple lawsuits were filed against the websites that were found to have used history sniffing, alleging a violation of the [[Computer Fraud and Abuse Act|Computer Fraud and Abuse Act of 1986]].<ref name=":1" />


In the same year, L David Baron from [[Mozilla Corporation]] developed a defence against the attack that all major browsers would later adopt. The defences included restrictions against what kinds of [[CSS]] attributes could be used to style visited links. The ability to add background images and CSS transitions to links was disallowed. In addition to this, visited links would be treated identically to standard links, with Javascript APIs returning the same attributes for a visited link as those for non-visited links. This ensured that malicious websites could not simply infer a person's browsing history by querying the colour changes.<ref>{{Cite web |title=privacy-related changes coming to CSS:visited – Mozilla Hacks - the Web developer blog |url=https://hacks.mozilla.org/2010/03/privacy-related-changes-coming-to-css-vistited |access-date=2023-11-16 |website=Mozilla Hacks – the Web developer blog |language=en-US}}</ref>
In the same year, L. David Baron from [[Mozilla Corporation]] developed a defence against the attack that all major browsers would later adopt. The defence included restrictions on which [[CSS]] attributes could be used to style visited links. The ability to add background images and CSS transitions to links was disallowed. Additionally, visited links would be treated identically to standard links, with Javascript [[API|application programming interfaces]] (APIs) that allow the website to query the colour of specific elements returning the same attributes for a visited link as those for non-visited links. This ensured malicious websites could not simply infer a person's browsing history by querying the colour changes.<ref>{{Cite web |title=privacy-related changes coming to CSS:visited – Mozilla Hacks the Web developer blog |url=https://hacks.mozilla.org/2010/03/privacy-related-changes-coming-to-css-vistited |access-date=16 November 2023 |website=Mozilla Hacks – the Web developer blog |language=en-US |archive-date=7 June 2023 |archive-url=https://web.archive.org/web/20230607154747/https://hacks.mozilla.org/2010/03/privacy-related-changes-coming-to-css-vistited/ |url-status=live }}</ref>
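
The effect of this defence on the colour-query channel can be illustrated with a small simulation. The sketch below is a toy model rather than browser code: the function names <code>computed_colour</code> and <code>sniff</code>, the colour constants and the example URLs are all hypothetical, and the model assumes that the only signal available to a script is the colour reported by a style query.

```python
from __future__ import annotations

# Toy model of the colour-query channel; this is not real browser behaviour.
UNVISITED = "blue"
VISITED = "purple"


def computed_colour(url: str, history: set[str], mitigated: bool) -> str:
    """Colour a script would be told when it queries the style of a link."""
    if mitigated:
        # Behaviour after the defence: the query answers as if every link
        # were unvisited, regardless of what is painted on screen.
        return UNVISITED
    return VISITED if url in history else UNVISITED


def sniff(candidates: list[str], history: set[str], mitigated: bool) -> list[str]:
    """Return the candidate URLs that the colour query reveals as visited."""
    return [
        url
        for url in candidates
        if computed_colour(url, history, mitigated) == VISITED
    ]


# Hypothetical browsing history and list of links probed by a page.
history = {"https://example.org/a"}
candidates = ["https://example.org/a", "https://example.org/b"]

leaked_before = sniff(candidates, history, mitigated=False)
leaked_after = sniff(candidates, history, mitigated=True)
```

In this model the unmitigated query leaks exactly the visited candidate, while the mitigated query leaks nothing, which is why later attacks had to rely on other signals such as user interaction or timing.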


In 2011, research by then [[Stanford University|Stanford]] graduate student [[Jonathan Mayer|Jonathan Mayers]] found that a advertising company Epic Marketplace Inc. had used history sniffing to collect information about the browsing history of users across the web.<ref>{{Cite web |title=Tracking the Trackers: To Catch a History Thief |url=https://cyberlaw.stanford.edu/blog/2011/07/tracking-trackers-catch-history-thief |access-date=2023-11-16 |website=cyberlaw.stanford.edu |language=en}}</ref><ref>{{Cite web |last=Goodin |first=Dan |title=Marketer taps browser flaw to see if you're pregnant |url=https://www.theregister.com/2011/07/22/marketer_sniffs_browser_history/ |access-date=2023-11-16 |website=www.theregister.com |language=en}}</ref> As a part of a subsequent investigation by the [[Federal Trade Commission|Federal Trade Commision]], it was revealed that Epic Marketplace had used history sniffing code as a part of advertisments in over 24000 web domains, such as the likes of [[ESPN]], [[Papa John's|Papa Johns]] etc. The Javascript code allowed Epic Markteplace Inc. to track if a user has visited any of over 54000 domains.<ref>{{Cite web |title=FTC Final Order Prohibits Epic Marketplace From "History Sniffing" |url=https://www.jdsupra.com/legalnews/ftc-final-order-prohibits-epic-marketpla-06062/ |access-date=2023-11-16 |website=JD Supra |language=en}}</ref><ref name=":2">{{Cite web |date=2012-12-05 |title=FTC Settlement Puts an End to "History Sniffing" by Online Advertising Network Charged With Deceptively Gathering Data on Consumers |url=https://www.ftc.gov/news-events/news/press-releases/2012/12/ftc-settlement-puts-end-history-sniffing-online-advertising-network-charged-deceptively-gathering |access-date=2023-11-16 |website=Federal Trade Commission |language=en}}</ref> The resulting data was subsequently used by Epic Marketplace to categorize users into specific groups and serve advertisements based on the websites the user had visited. 
As a result of this investigation, the Federal Trade Commission banned Epic Marketplace Inc. from conducting any form of online advertising, marketing, etc, for over twenty years. In addition to this, Epic Marketplace Inc. was ordered to permanently delete and destroy the data it had collected over the years of users' browsing data.<ref>{{Cite web |last=Gross |first=Grant |date=2012-12-05 |title=US FTC bars advertising firm from sniffing browser histories |url=https://www.computerworld.com/article/2716495/us-ftc-bars-advertising-firm-from-sniffing-browser-histories.html |access-date=2023-11-16 |website=Computerworld |language=en}}</ref><ref name=":2" />
In 2011, research by then-[[Stanford University|Stanford]] graduate student [[Jonathan Mayer]] found that advertising company Epic Marketplace Inc. had used history sniffing to collect information about the browsing history of users across the web.<ref>{{Cite web |title=Tracking the Trackers: To Catch a History Thief |url=https://cyberlaw.stanford.edu/blog/2011/07/tracking-trackers-catch-history-thief |access-date=16 November 2023 |website=cyberlaw.stanford.edu |language=en |archive-date=16 November 2023 |archive-url=https://web.archive.org/web/20231116005936/https://cyberlaw.stanford.edu/blog/2011/07/tracking-trackers-catch-history-thief |url-status=live }}</ref><ref>{{Cite web |last=Goodin |first=Dan |title=Marketer taps browser flaw to see if you're pregnant |url=https://www.theregister.com/2011/07/22/marketer_sniffs_browser_history/ |access-date=16 November 2023 |website=www.theregister.com |language=en |archive-date=16 November 2023 |archive-url=https://web.archive.org/web/20231116005936/https://www.theregister.com/2011/07/22/marketer_sniffs_browser_history/ |url-status=live }}</ref> A subsequent investigation by the [[Federal Trade Commission]] (FTC) revealed that Epic Marketplace had used history sniffing code as a part of advertisements in over 24,000 web domains, including [[ESPN]] and [[Papa John's|Papa Johns]]. 
The Javascript code allowed Epic Marketplace to track if a user has visited any of over 54,000 domains.<ref>{{Cite web |title=FTC Final Order Prohibits Epic Marketplace From "History Sniffing" |url=https://www.jdsupra.com/legalnews/ftc-final-order-prohibits-epic-marketpla-06062/ |access-date=16 November 2023 |website=JD Supra |language=en |archive-date=16 November 2023 |archive-url=https://web.archive.org/web/20231116005936/https://www.jdsupra.com/legalnews/ftc-final-order-prohibits-epic-marketpla-06062/ |url-status=live }}</ref><ref name=":2">{{Cite web |date=5 December 2012 |title=FTC Settlement Puts an End to "History Sniffing" by Online Advertising Network Charged With Deceptively Gathering Data on Consumers |url=https://www.ftc.gov/news-events/news/press-releases/2012/12/ftc-settlement-puts-end-history-sniffing-online-advertising-network-charged-deceptively-gathering |access-date=16 November 2023 |website=Federal Trade Commission |language=en |archive-date=16 November 2023 |archive-url=https://web.archive.org/web/20231116005936/https://www.ftc.gov/news-events/news/press-releases/2012/12/ftc-settlement-puts-end-history-sniffing-online-advertising-network-charged-deceptively-gathering |url-status=live }}</ref> The resulting data was subsequently used by Epic Marketplace to categorize users into specific groups and serve advertisements based on the websites the user had visited. As a result of this investigation, the FTC banned Epic Marketplace Inc. 
from conducting any form of online advertising and marketing for twenty years, and the company was ordered to permanently delete the data it had collected.<ref>{{Cite web |last=Gross |first=Grant |date=5 December 2012 |title=US FTC bars advertising firm from sniffing browser histories |url=https://www.computerworld.com/article/2716495/us-ftc-bars-advertising-firm-from-sniffing-browser-histories.html |access-date=16 November 2023 |website=Computerworld |language=en |archive-date=16 November 2023 |archive-url=https://web.archive.org/web/20231116005936/https://www.computerworld.com/article/2716495/us-ftc-bars-advertising-firm-from-sniffing-browser-histories.html |url-status=live }}</ref><ref name=":2" />


== Threat model ==
The [[threat model]] of history sniffing relies on the adversary being able to direct the victim to a malicious website entirely or partially under the adversary's control. The adversary can accomplish this by compromising a previously good web page, by [[phishing]] the user to a web page allowing the adversary to load arbitrary code, or by using a malicious advertisement on an otherwise safe web page.<ref name=":1" /><ref>{{Cite journal |last=Sanchez-Rola |first=Iskander |last2=Balzarotti |first2=Davide |last3=Santos |first3=Igor |date=2020-12-22 |title=Cookies from the Past: Timing Server-side Request Processing Code for History Sniffing |url=https://dl.acm.org/doi/10.1145/3419473 |journal=Digital Threats: Research and Practice |volume=1 |issue=4 |pages=24:1–24:24 |doi=10.1145/3419473}}</ref> While most history sniffing attacks do not require user interactions, specific variants of the attacks need users to interact with particular elements which can often be disguised in the form of buttons, browser games, [[CAPTCHA|CAPTCHA's]] etc.<ref name=":0" />
The [[threat model]] of history sniffing relies on the adversary being able to direct the victim to a malicious website entirely or partially under the adversary's control. The adversary can accomplish this by compromising a previously good web page, by [[phishing]] the user to a web page allowing the adversary to load arbitrary code, or by using a malicious advertisement on an otherwise safe web page.<ref name=":1" /><ref>{{Cite journal |last1=Sanchez-Rola |first1=Iskander |last2=Balzarotti |first2=Davide |last3=Santos |first3=Igor |date=22 December 2020 |title=Cookies from the Past: Timing Server-side Request Processing Code for History Sniffing |journal=Digital Threats: Research and Practice |volume=1 |issue=4 |pages=24:1–24:24 |doi=10.1145/3419473 |doi-access=free }}</ref> While most history sniffing attacks do not require user interactions, specific variants of the attacks need users to interact with particular elements which can often be disguised as buttons, browser games, [[CAPTCHA|CAPTCHAs]], and other such elements.<ref name=":0" />


== Modern variants ==
Despite being partially mitigated in 2010, history sniffing is still considered an unsolved issue. In 2011, researchers at [[Carnegie Mellon University]] showed that while the defences proposed by Mozilla were sufficient to prevent most non-interactive attacks, such as those found by Jang et al., they were ineffective against interactive attacks. By showing users overlayed letters, numbers and patterns, which would only reveal themselves if a user had visited a specific website while solving CAPTCHAs, playing chess games or guessing letter combinations, the researchers could show that history sniffing was still viable via interactive attacks.<ref>{{Cite book |last1=Kikuchi |first1=Hiroaki |url=https://ieeexplore.ieee.org/document/7794539 |title=2016 10th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS) |last2=Sasa |first2=Kota |last3=Shimizu |first3=Yuta |date=2016 |publisher=IEEE |isbn=978-1-5090-0984-8 |pages=599–602 |language=en-US |chapter=Interactive History Sniffing Attack with Amida Lottery |doi=10.1109/IMIS.2016.109 |access-date=2023-10-30 |s2cid=32216851}}</ref><ref name=":0" />
Despite being partially mitigated in 2010, history sniffing is still considered an unsolved problem.<ref name=":1" /> In 2011, researchers at [[Carnegie Mellon University]] showed that while the defences proposed by Mozilla were sufficient to prevent most non-interactive attacks, such as those found by ''Jang et al.'', they were ineffective against interactive attacks. By showing users overlaid letters, numbers and patterns, which would only reveal themselves if a user had visited a specific website, the researchers were able to trick 307 participants into potentially revealing their browsing history via history sniffing. This was done by presenting the activities in the form of pattern solving problems, chess games and CAPTCHAs.<ref>{{Cite book |last1=Kikuchi |first1=Hiroaki |url=https://ieeexplore.ieee.org/document/7794539 |title=2016 10th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS) |last2=Sasa |first2=Kota |last3=Shimizu |first3=Yuta |date=2016 |publisher=IEEE |isbn=978-1-5090-0984-8 |pages=599–602 |language=en-US |chapter=Interactive History Sniffing Attack with Amida Lottery |doi=10.1109/IMIS.2016.109 |access-date=30 October 2023 |s2cid=32216851 |archive-date=6 June 2018 |archive-url=https://web.archive.org/web/20180606235308/https://ieeexplore.ieee.org/document/7794539/ |url-status=live }}</ref><ref name=":0" />


In 2018, researchers in [[University of California, San Diego|University of California, San Deigo]] demonstrated timing attacks that could bypass the mitigations introduced by Mozilla. By abusing the CSS paint API and targeting the byte-code cache of the browser, the researchers were able to time the amount of time it took to paint specific links. They thus were able to provide probabilistic techniques for identifying visited websites.<ref>{{Cite web |last=Haskins |first=Caroline |date=2018-11-02 |title=Old School 'Sniffing' Attacks Can Still Reveal Your Browsing History |url=https://www.vice.com/en/article/zm9jd4/old-school-sniffing-attacks-can-still-reveal-your-browsing-history |access-date=2023-10-30 |website=Vice |language=en}}</ref><ref>{{Cite journal |last=Smith |first=Michael |last2=Disselkoen |first2=Craig |last3=Narayan |first3=Shravan |last4=Brown |first4=Fraser |last5=Stefan |first5=Deian |date=2018 |title=Browser history {re:visited} |url=https://www.usenix.org/conference/woot18/presentation/smith |journal=OFFENSIVE TECHNOLOGIES. USENIX WORKSHOP. 12TH 2018. (WOOT'18) |language=en |s2cid=51939166}}</ref>
In 2018, researchers at the [[University of California, San Diego]] demonstrated timing attacks that could bypass the mitigations introduced by Mozilla. By abusing the CSS paint API (which allows developers to draw a background image programmatically) and targeting the [[bytecode]] [[Web cache|cache]] of the browser, the researchers were able to measure how long it took to paint specific links. Thus, they were able to provide probabilistic techniques for identifying visited websites.<ref>{{Cite web |last=Haskins |first=Caroline |date=2 November 2018 |title=Old School 'Sniffing' Attacks Can Still Reveal Your Browsing History |url=https://www.vice.com/en/article/zm9jd4/old-school-sniffing-attacks-can-still-reveal-your-browsing-history |access-date=30 October 2023 |website=Vice |language=en}}</ref><ref>{{Cite journal |last1=Smith |first1=Michael |last2=Disselkoen |first2=Craig |last3=Narayan |first3=Shravan |last4=Brown |first4=Fraser |last5=Stefan |first5=Deian |date=2018 |title=Browser history {re:visited} |url=https://www.usenix.org/conference/woot18/presentation/smith |journal=Offensive Technologies. Usenix Workshop. 12th 2018. (Woot'18) |language=en |s2cid=51939166}}</ref>
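
The probabilistic step of such a timing attack can be sketched independently of any browser API. The following is a minimal illustration, assuming the attacker has already collected paint timings (in milliseconds) for calibration links known to be visited and unvisited. The functions <code>midpoint_threshold</code> and <code>classify</code> and all sample values are hypothetical and are not taken from the cited paper. Which class is slower depends on the specific channel, so the sketch infers the direction from the calibration data.

```python
from statistics import median


def midpoint_threshold(visited_ms, unvisited_ms):
    """Return (threshold, visited_is_slower) derived from calibration timings."""
    v = median(visited_ms)
    u = median(unvisited_ms)
    # The threshold sits halfway between the two calibration medians.
    return (v + u) / 2, v > u


def classify(samples_ms, threshold, visited_is_slower):
    """Guess whether a link was visited from repeated timing samples."""
    slower = median(samples_ms) > threshold
    return slower == visited_is_slower


# Hypothetical calibration data: visited links take longer to paint here.
threshold, visited_is_slower = midpoint_threshold(
    visited_ms=[4.1, 3.9, 4.3, 4.0],
    unvisited_ms=[2.0, 2.2, 1.9, 2.1],
)

guess_a = classify([4.2, 3.8, 4.4], threshold, visited_is_slower)
guess_b = classify([2.1, 2.3, 1.8], threshold, visited_is_slower)
```

Using the median of several samples rather than a single measurement makes the guess less sensitive to outliers, which is one reason such techniques are described as probabilistic rather than exact.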


In recent years, multiple attacks have been found targeting various newer features provided by browsers. In 2020, Sanchez-Rola et al. demonstrated that by measuring the time a server takes to respond to a request with cookies and then comparing it to how long it took for a server to respond without cookies, a website could perform history sniffing.<ref>{{Cite journal |last1=Sanchez-Rola |first1=Iskander |last2=Balzarotti |first2=Davide |last3=Santos |first3=Igor |date=2020-12-22 |title=Cookies from the Past: Timing Server-side Request Processing Code for History Sniffing |url=https://dl.acm.org/doi/10.1145/3419473 |journal=Digital Threats: Research and Practice |volume=1 |issue=4 |pages=24:1–24:24 |doi=10.1145/3419473|s2cid=229716038 }}</ref> In 2023, Ali et al. demonstrated that newly introduced browser features could be abused also to perform history sniffing. One particularly notable example highlighted was the fact that a recently introduced feature, the Private Tokens API, introduced under Google's [[Privacy Sandbox]] initiative with an intension to prevent user tracking, could allow malicious actors to exfiltrate a users browsing data by using techniques similar to those used for [[Cross-site leaks|cross-site leak attacks]].<ref>{{Cite journal |last1=Ali |first1=Mir Masood |last2=Chitale |first2=Binoy |last3=Ghasemisharif |first3=Mohammad |last4=Kanich |first4=Chris |last5=Nikiforakis |first5=Nick |last6=Polakis |first6=Jason |date=2023 |title=Navigating Murky Waters: Automated Browser Feature Testing for Uncovering Tracking Vectors |url=https://dx.doi.org/10.14722/ndss.2023.24072 |journal=Proceedings 2023 Network and Distributed System Security Symposium |location=Reston, VA |publisher=Internet Society |doi=10.14722/ndss.2023.24072|isbn=978-1-891562-83-9 |s2cid=257502501 }}</ref>
Since 2019, multiple history sniffing attacks have been found targeting various newer features browsers provide. In 2020, ''Sanchez-Rola et al.'' demonstrated that by measuring the time a server takes to respond to a request with [[HTTP cookie|HTTP cookies]] and then comparing it to how long it took for a server to respond without cookies, a website could perform history sniffing.<ref>{{Cite journal |last1=Sanchez-Rola |first1=Iskander |last2=Balzarotti |first2=Davide |last3=Santos |first3=Igor |date=22 December 2020 |title=Cookies from the Past: Timing Server-side Request Processing Code for History Sniffing |journal=Digital Threats: Research and Practice |volume=1 |issue=4 |pages=24:1–24:24 |doi=10.1145/3419473 |s2cid=229716038 |doi-access=free }}</ref> In 2023, ''Ali et al.'' demonstrated that newly introduced browser features could be abused also to perform history sniffing. One particularly notable example highlighted was the fact that a recently introduced feature, the Private Tokens API, introduced under Google's [[Privacy Sandbox]] initiative with an intention to prevent user tracking, could allow malicious actors to exfiltrate users browsing data by using techniques similar to those used for [[Cross-site leaks|cross-site leak attacks]].<ref>{{Cite journal |last1=Ali |first1=Mir Masood |last2=Chitale |first2=Binoy |last3=Ghasemisharif |first3=Mohammad |last4=Kanich |first4=Chris |last5=Nikiforakis |first5=Nick |last6=Polakis |first6=Jason |date=2023 |title=Navigating Murky Waters: Automated Browser Feature Testing for Uncovering Tracking Vectors (ABTUTV) |journal=Proceedings 2023 Network and Distributed System Security Symposium |location=Reston, VA |publisher=Internet Society |doi=10.14722/ndss.2023.24072|isbn=978-1-891562-83-9 |s2cid=257502501 |doi-access=free }}</ref>


== References ==
{{reflist}}

{{Information security}}


[[Category:Web security exploits]]

Latest revision as of 12:16, 9 March 2024

History sniffing is a class of web vulnerabilities and attacks that allow a website to track a user's web browsing history by recording which websites the user has and has not visited. It works by exploiting long-standing information leakage issues inherent to the design of the web platform, the best known of which involves detecting CSS attribute changes in links that the user has already visited.

Although it has been known since 2002, history sniffing is still considered an unsolved problem. In 2010, researchers revealed that multiple high-profile websites had used history sniffing to identify and track users. Shortly afterwards, Mozilla and the developers of all other major web browsers implemented defences against history sniffing. However, recent research has shown that these mitigations are ineffective against specific variants of the attack and that history sniffing can still occur via visited links and newer browser features.

Background

Early browsers such as Mosaic and Netscape Navigator were built on the model of the web being a set of statically linked documents known as pages. In this model, it made sense for the user to know which documents they had previously visited and which they hadn't, regardless of which document was referring to them.[1] Mosaic, one of the earliest graphical web browsers, used purple links to show that a page had been visited and blue links to show pages that had not been visited.[2][3] This paradigm stuck around and was subsequently adopted by all modern web browsers.[4]

Over the years, the web evolved from its original model of static content towards more dynamic content. In 1995, employees at Netscape added a scripting language, JavaScript, to the company's flagship web browser, Netscape Navigator. This addition allowed interactivity to be added to a web page by executing JavaScript programs as part of the rendering process.[5][6] However, it also introduced a new security problem: these JavaScript programs could access each other's execution context and sensitive information about the user. As a result, shortly afterwards, Netscape Navigator introduced the same-origin policy, a security measure that prevented JavaScript from arbitrarily accessing data in a different web page's execution context.[7] However, while the same-origin policy was subsequently extended to cover a large variety of features introduced before its existence, it was never extended to cover hyperlinks, since doing so was perceived to hurt the user's ability to browse the web.[4] This seemingly innocuous omission would manifest as one of the earliest and best-known forms of history sniffing on the web.[8]

History

Text with two links, one titled leukemia is purple, the other is not.
By extracting the colour of certain links, a website can access personally identifiable information. In this example, the website could infer that the user might be interested in leukemia, a form of blood cancer.

One of the first publicly disclosed reports of a history sniffing exploit was made by Andrew Clover from Purdue University in a post on the BUGTRAQ mailing list in 2002. The post detailed how a malicious website could use JavaScript to determine whether a given link was of a specific colour, thus revealing whether the link had been previously visited.[9] While this was initially thought to be a theoretical exploit with little real-world value, later research by Jang et al. in 2010 revealed that high-profile websites were using this technique in the wild to reveal user browsing data.[10] As a result, multiple lawsuits were filed against the websites found to have used history sniffing, alleging a violation of the Computer Fraud and Abuse Act of 1986.[8]

In the same year, L. David Baron from Mozilla Corporation developed a defence against the attack that all major browsers would later adopt. The defence restricted which CSS attributes could be used to style visited links; in particular, the ability to add background images and CSS transitions to links was disallowed. Additionally, scripts would see visited links as identical to unvisited ones: JavaScript application programming interfaces (APIs) that allow a website to query the colour of specific elements would return the same attributes for a visited link as for a non-visited link. This ensured that malicious websites could not simply infer a person's browsing history by querying the colour changes.[11]
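The effect of this defence can be illustrated with a minimal toy model, written here in Python rather than browser code. Everything in it is an assumption made for illustration: `ToyBrowser`, `Link`, `rendered_colour`, `queried_colour` and `sniff` are hypothetical names and do not correspond to any real browser API. The sketch only shows the idea that the colour drawn on screen and the colour reported to a script are decoupled once the mitigation is enabled.

```python
from dataclasses import dataclass

UNVISITED_COLOUR = "blue"
VISITED_COLOUR = "purple"


@dataclass(frozen=True)
class Link:
    url: str


class ToyBrowser:
    """Hypothetical model of link styling; not a real browser API."""

    def __init__(self, history, mitigated):
        self._history = frozenset(history)
        self._mitigated = mitigated

    def rendered_colour(self, link):
        # What the user sees on screen: visited links are still drawn differently.
        return VISITED_COLOUR if link.url in self._history else UNVISITED_COLOUR

    def queried_colour(self, link):
        # What a script is told when it asks for the link's colour.
        if self._mitigated:
            # With the defence enabled, every link is reported as unvisited.
            return UNVISITED_COLOUR
        return self.rendered_colour(link)


def sniff(browser, candidate_urls):
    """Return the candidate URLs that a querying script would infer as visited."""
    return [
        url
        for url in candidate_urls
        if browser.queried_colour(Link(url)) == VISITED_COLOUR
    ]


history = {"https://visited.example/"}
candidates = ["https://visited.example/", "https://other.example/"]
print(sniff(ToyBrowser(history, mitigated=False), candidates))
print(sniff(ToyBrowser(history, mitigated=True), candidates))
```

In this model the unmitigated browser leaks the visited candidate, while the mitigated one reports nothing, even though `rendered_colour` still distinguishes visited links for the user.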

In 2011, research by then-Stanford graduate student Jonathan Mayer found that advertising company Epic Marketplace Inc. had used history sniffing to collect information about the browsing history of users across the web.[12][13] A subsequent investigation by the Federal Trade Commission (FTC) revealed that Epic Marketplace had used history sniffing code as a part of advertisements in over 24,000 web domains, including ESPN and Papa John's. The JavaScript code allowed Epic Marketplace to track whether a user had visited any of over 54,000 domains.[14][15] The resulting data was subsequently used by Epic Marketplace to categorize users into specific groups and serve advertisements based on the websites the user had visited. As a result of this investigation, the FTC banned Epic Marketplace Inc. from conducting any form of online advertising and marketing for twenty years and ordered it to permanently delete the data it had collected.[16][15]

Threat model

The threat model of history sniffing relies on the adversary being able to direct the victim to a malicious website entirely or partially under the adversary's control. The adversary can accomplish this by compromising a previously good web page, by phishing the user into visiting a web page that allows the adversary to load arbitrary code, or by using a malicious advertisement on an otherwise safe web page.[8][17] While most history sniffing attacks do not require user interaction, specific variants need users to interact with particular elements, which are often disguised as buttons, browser games, CAPTCHAs, and other such elements.[4]

Modern variants

Despite being partially mitigated in 2010, history sniffing is still considered an unsolved problem.[8] In 2011, researchers at Carnegie Mellon University showed that while the defences proposed by Mozilla were sufficient to prevent most non-interactive attacks, such as those found by Jang et al., they were ineffective against interactive attacks. By showing users overlaid letters, numbers and patterns that would only reveal themselves if a user had visited a specific website, the researchers were able to trick 307 participants into potentially revealing their browsing history via history sniffing. This was done by presenting the activities in the form of pattern-solving problems, chess games and CAPTCHAs.[18][4]

In 2018, researchers at the University of California, San Diego demonstrated timing attacks that could bypass the mitigations introduced by Mozilla. By abusing the CSS paint API (which allows developers to draw a background image programmatically) and targeting the bytecode cache of the browser, the researchers were able to measure how long it took to paint specific links. From these measurements, they were able to provide probabilistic techniques for identifying visited websites.[19][20]
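The statistical step behind such timing attacks can be sketched in Python as follows. This is only a generic illustration on synthetic data, not the method used by the researchers: the function names, the Gaussian noise model, the threshold and the assumption that a visited link is the slower case are all made up for the example.

```python
import random
from statistics import median


def classify_visited(candidate_ms, baseline_ms, threshold_ms):
    """Flag a candidate as 'probably visited' when its median timing
    exceeds the baseline median by more than threshold_ms."""
    return median(candidate_ms) - median(baseline_ms) > threshold_ms


def simulate_timings(mean_ms, noise_ms, n, rng):
    # Synthetic measurements: Gaussian noise around a mean, clamped at zero.
    return [max(0.0, rng.gauss(mean_ms, noise_ms)) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the demonstration is repeatable

# Assumed for illustration: a known-unvisited baseline averages 5.0 ms and a
# visited link averages 6.5 ms, both with 1.0 ms of measurement noise.
baseline = simulate_timings(5.0, 1.0, 200, rng)
visited_like = simulate_timings(6.5, 1.0, 200, rng)
unvisited_like = simulate_timings(5.0, 1.0, 200, rng)

print(classify_visited(visited_like, baseline, threshold_ms=0.75))
print(classify_visited(unvisited_like, baseline, threshold_ms=0.75))
```

Using the median of many repeated measurements, rather than a single sample, is what makes the result probabilistic but usable: individual timings overlap heavily, while the medians of the two groups are separated by a margin that is large compared with their noise.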

Since 2019, multiple history sniffing attacks have been found that target newer browser features. In 2020, Sanchez-Rola et al. demonstrated that by measuring the time a server takes to respond to a request with HTTP cookies and comparing it to the time the server takes to respond without cookies, a website could perform history sniffing.[21] In 2023, Ali et al. demonstrated that newly introduced browser features could also be abused to perform history sniffing. One particularly notable example was the Private Tokens API, a recently introduced feature of Google's Privacy Sandbox initiative intended to prevent user tracking, which could allow malicious actors to exfiltrate users' browsing data by using techniques similar to those used for cross-site leak attacks.[22]

References

  1. ^ "WorldWideWeb: Proposal for a HyperText Project". www.w3.org. Archived from the original on 29 June 2023. Retrieved 15 November 2023.
  2. ^ "Why are hyperlinks blue? | The Mozilla Blog". blog.mozilla.org. Archived from the original on 15 November 2023. Retrieved 15 November 2023.
  3. ^ "EMail Msg". ksi.cpsc.ucalgary.ca. Archived from the original on 15 November 2023. Retrieved 15 November 2023.
  4. ^ a b c d Weinberg, Zachary; Chen, Eric Y.; Jayaraman, Pavithra Ramesh; Jackson, Collin (2011). "I Still Know What You Visited Last Summer: Leaking Browsing History via User Interaction and Side Channel Attacks". 2011 IEEE Symposium on Security and Privacy. IEEE. pp. 147–161. doi:10.1109/SP.2011.23. ISBN 978-1-4577-0147-4. S2CID 10662023. Archived from the original on 24 December 2022. Retrieved 30 October 2023.
  5. ^ "JavaScript 1.0 – 1995". www.webdesignmuseum.org. Archived from the original on 7 August 2020. Retrieved 19 January 2020.
  6. ^ "Welcome to Netscape Navigator Version 2.0". netscape.com. 14 June 1997. Archived from the original on 14 June 1997. Retrieved 16 February 2020.
  7. ^ "Netscape 3.0 Handbook – Advanced topics". netscape.com. Archived from the original on 8 August 2002. Retrieved 16 February 2020. Navigator version 2.02 and later automatically prevents scripts on one server from accessing properties of documents on a different server.
  8. ^ a b c d Van Goethem, Tom; Joosen, Wouter; Nikiforakis, Nick (12 October 2015). "The Clock is Still Ticking: Timing Attacks in the Modern Web". Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. CCS '15. New York, NY, USA: Association for Computing Machinery. pp. 1382–1393. doi:10.1145/2810103.2813632. ISBN 978-1-4503-3832-5. S2CID 17705638.
  9. ^ "Bugtraq: CSS visited pages disclosure". seclists.org. Archived from the original on 16 November 2023. Retrieved 16 November 2023.
  10. ^ Jang, Dongseok; Jhala, Ranjit; Lerner, Sorin; Shacham, Hovav (4 October 2010). "An empirical study of privacy-violating information flows in JavaScript web applications". Proceedings of the 17th ACM conference on Computer and communications security. CCS '10. New York, NY, USA: Association for Computing Machinery. pp. 270–283. doi:10.1145/1866307.1866339. ISBN 978-1-4503-0245-6. S2CID 10901628.
  11. ^ "privacy-related changes coming to CSS:visited – Mozilla Hacks – the Web developer blog". Mozilla Hacks – the Web developer blog. Archived from the original on 7 June 2023. Retrieved 16 November 2023.
  12. ^ "Tracking the Trackers: To Catch a History Thief". cyberlaw.stanford.edu. Archived from the original on 16 November 2023. Retrieved 16 November 2023.
  13. ^ Goodin, Dan. "Marketer taps browser flaw to see if you're pregnant". www.theregister.com. Archived from the original on 16 November 2023. Retrieved 16 November 2023.
  14. ^ "FTC Final Order Prohibits Epic Marketplace From "History Sniffing"". JD Supra. Archived from the original on 16 November 2023. Retrieved 16 November 2023.
  15. ^ a b "FTC Settlement Puts an End to "History Sniffing" by Online Advertising Network Charged With Deceptively Gathering Data on Consumers". Federal Trade Commission. 5 December 2012. Archived from the original on 16 November 2023. Retrieved 16 November 2023.
  16. ^ Gross, Grant (5 December 2012). "US FTC bars advertising firm from sniffing browser histories". Computerworld. Archived from the original on 16 November 2023. Retrieved 16 November 2023.
  17. ^ Sanchez-Rola, Iskander; Balzarotti, Davide; Santos, Igor (22 December 2020). "Cookies from the Past: Timing Server-side Request Processing Code for History Sniffing". Digital Threats: Research and Practice. 1 (4): 24:1–24:24. doi:10.1145/3419473.
  18. ^ Kikuchi, Hiroaki; Sasa, Kota; Shimizu, Yuta (2016). "Interactive History Sniffing Attack with Amida Lottery". 2016 10th International Conference on Innovative Mobile and Internet Services in Ubiquitous Computing (IMIS). IEEE. pp. 599–602. doi:10.1109/IMIS.2016.109. ISBN 978-1-5090-0984-8. S2CID 32216851. Archived from the original on 6 June 2018. Retrieved 30 October 2023.
  19. ^ Haskins, Caroline (2 November 2018). "Old School 'Sniffing' Attacks Can Still Reveal Your Browsing History". Vice. Retrieved 30 October 2023.
  20. ^ Smith, Michael; Disselkoen, Craig; Narayan, Shravan; Brown, Fraser; Stefan, Deian (2018). "Browser history {re:visited}". Offensive Technologies. Usenix Workshop. 12th 2018. (Woot'18). S2CID 51939166.
  21. ^ Sanchez-Rola, Iskander; Balzarotti, Davide; Santos, Igor (22 December 2020). "Cookies from the Past: Timing Server-side Request Processing Code for History Sniffing". Digital Threats: Research and Practice. 1 (4): 24:1–24:24. doi:10.1145/3419473. S2CID 229716038.
  22. ^ Ali, Mir Masood; Chitale, Binoy; Ghasemisharif, Mohammad; Kanich, Chris; Nikiforakis, Nick; Polakis, Jason (2023). "Navigating Murky Waters: Automated Browser Feature Testing for Uncovering Tracking Vectors (ABTUTV)". Proceedings 2023 Network and Distributed System Security Symposium. Reston, VA: Internet Society. doi:10.14722/ndss.2023.24072. ISBN 978-1-891562-83-9. S2CID 257502501.