NSA Book Obtained by FOIA Details Google Hacking to Find Information Never Meant to Be Published

NSA Headquarters in Fort Meade, Maryland. (Photo: Wikimedia)

Much of the National Security Agency is under a shroud of secrecy, but a recently released document is giving us some insight into how its cyber spies were trained to use search engines and other online tools.

The more than 600-page document, Untangling the Web: A Guide to Internet Research, was obtained through a Freedom of Information Act request made through MuckRock, according to Wired. It should be noted that the guide is several years old, and the Internet, search engines and online security have continued to evolve since. The NSA also expressly states on the cover of the book that the opinions of its authors, whose names are redacted, do not represent the NSA’s official position.


(Image: NSA.gov)

Before being released through FOIA, the book, published by NSA’s Center for Digital Content, was only to be used by the “original recipient,” who could make copies to distribute “only within the recipient’s agency or organization.”

Many chapters are rather basic, but some tech sites are calling out a more interesting chapter titled “Google Hacking,” which delves into how to find information on the Web that was never intended to be published.

“‘Google (or search engine) hacking’ involves using publicly available search engines to access publicly available information that almost certainly was not intended for public distribution,” the authors wrote. “In short, it’s using clever but legal techniques.”

This type of information includes:

  • Personal or financial information
  • User IDs, computer or account logins, passwords
  • Private, confidential or proprietary company data
  • Sensitive government information
  • Vulnerabilities in websites and servers that could facilitate breaking into the site

“Normally, one would have to be actively looking for this type of information,” Untangling the Web states. “Of course, many documents Google hackers find using these techniques are not sensitive and indeed are intended for public Internet. Only a tiny fraction of the over eight billion pages in the Google index were not meant to be made available to the public.”

Wired explains in more detail what this chapter covers:

Say you’re a cyberspy for the NSA and you want sensitive inside information on companies in South Africa. What do you do?

Search for confidential Excel spreadsheets the company inadvertently posted online by typing “filetype:xls site:za confidential” into Google, the book notes.

Want to find spreadsheets full of passwords in Russia? Type “filetype:xls site:ru login.” Even on websites written in non-English languages the terms “login,” “userid,” and “password” are generally written in English, the authors helpfully point out.

Misconfigured web servers “that list the contents of directories not intended to be on the web often offer a rich load of information to Google hackers,” the authors write, then offer a command to exploit these vulnerabilities — intitle: “index of” site:kr password.

The authors assure readers once again that nothing they are revealing is “anything that is not already widely known and used on the Internet by both legitimate and illicit Google hackers.”
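The queries quoted above all follow the same pattern: one or more search operators (filetype:, site:, intitle:) followed by plain keywords. As a rough illustration only, the sketch below assembles such query strings in Python. The function names build_query and search_url are hypothetical and do not come from the NSA book or from Wired. The code only builds strings and sends no requests, and it joins each operator to its value without a space.

```python
from urllib.parse import urlencode


def build_query(terms=(), filetype=None, site=None, intitle=None):
    """Assemble a search string from operators and plain keywords.

    Hypothetical helper for illustration; it only formats text.
    """
    parts = []
    if intitle:
        # Multi-word titles are quoted so they are matched as a phrase.
        value = f'"{intitle}"' if " " in intitle else intitle
        parts.append(f"intitle:{value}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    parts.extend(terms)
    return " ".join(parts)


def search_url(query):
    """URL-encode the query as the q parameter of a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # The three examples quoted from the book.
    print(build_query(["confidential"], filetype="xls", site="za"))
    print(build_query(["login"], filetype="xls", site="ru"))
    print(build_query(["password"], intitle="index of", site="kr"))
```

Run as a script, this prints the three example queries from the excerpt, which could then be pasted into a search box or passed to search_url to produce a link.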

Although perhaps a tad outdated, as the Verge pointed out, “it’s a fascinating look at what the NSA was thinking back in 2007.”