If you're looking for a comprehensive guide to OSINT (open-source intelligence) and Google dorking, you've come to the right place. In this blog post, we'll discuss what OSINT is and how Google dorks can be used to find sensitive information online. Stay tuned, because by the end of this post, you'll be a master at using OSINT techniques.
What is Google Dorking?
Google dorking, also known as "Google hacking," is a technique that uses Google's advanced search operators to locate valuable data or hard-to-find content.
Researchers found that they could combine the wide variety of these extra search options to find things via Google that you would not expect, and to narrow a search down to very specific things: vulnerabilities in systems, a target's attack surface, or lists of potential usernames. Google is an incredibly powerful search engine, more powerful than many people realise.
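As a rough illustration of how these extra options combine, here is a minimal Python sketch that assembles a query string from operator and value pairs. The operators used (site, intitle, filetype, inurl) are the ones named in the podcast transcript below; the helper names build_dork and search_url, and the company name "Example Ltd", are hypothetical and exist only for this example.

```python
from urllib.parse import quote_plus


def build_dork(terms: str = "", **operators: str) -> str:
    """Join free-text terms with operator:value pairs into one query string."""
    parts = [terms] if terms else []
    for name, value in operators.items():
        # Quote values containing spaces so the operator covers the whole phrase.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{name}:{value}")
    return " ".join(parts)


def search_url(query: str) -> str:
    """Return a Google search URL for the query, to open in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Only results from one site, as in the BBC example from the transcript.
print(build_dork(site="bbc.com"))
# PDFs that mention a (made-up) company name.
print(build_dork('"Example Ltd"', filetype="pdf"))
print(search_url(build_dork(site="bbc.com")))
```

The sketch only builds strings. It does not send any requests, so the resulting query can be pasted into a normal search box.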
What Are Some Examples of This?
One example is LinkedIn. LinkedIn is very difficult to scrape compared to a lot of other websites, and that is on purpose: the site does not want people harvesting names, employment histories and other details en masse. But Google is searching and indexing all of the time, so search results restricted to LinkedIn can still surface the names of people who work for a particular company. Another example is documents. Documents hosted on a company's website often contain usernames hidden in their metadata, and the first step is getting hold of those documents. With Google you can search for a company name restricted to a file type such as PDF, download the results, and extract the metadata with freely available tools, which can lead to very sensitive information. This also ties in with social media: by cross-referencing people's social media profiles with Google Maps, it is possible to find where they live. This can happen in a short amount of time. In one case it took about 45 to 50 minutes to go from finding a name all the way to a residential address.
How Serious is Finding Someone's Residential Address?
This is an interesting debate. Some may think it is not a huge concern, as many people already know where they live, but they often don't think about how many times a company or business has authenticated them over the phone by asking for the first line of their address. Worst-case scenarios include being swatted, where someone makes a false report of a serious crime at an address so that armed police are sent there. Little bits of personal information like this can be damaging in the wrong hands. People need to be aware of what is out there and get it removed where possible.
Unfortunately, with a lot of OSINT, you can't just switch it off. You need to understand what's out there and the risk it poses, from addresses to usernames. Once it's out there, it's hard to get it back.
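The transcript below describes turning a list of names into likely usernames by applying common email conventions. A minimal Python sketch of that step, useful for checking how guessable your own organisation's usernames are, might look like this. The two conventions shown are the ones mentioned in the podcast; the function names, the name "Jane Doe" and the domain example.com are made up for illustration.

```python
def username_candidates(first: str, last: str) -> list[str]:
    """Return likely usernames for a person under two common conventions."""
    first, last = first.strip().lower(), last.strip().lower()
    return [
        f"{first}.{last}",    # first dot last
        f"{first[0]}{last}",  # first initial, then the whole surname
    ]


def email_candidates(first: str, last: str, domain: str) -> list[str]:
    """Attach a mail domain to each candidate username."""
    return [f"{user}@{domain}" for user in username_candidates(first, last)]


# Hypothetical person and domain, used only to show the output shape.
print(email_candidates("Jane", "Doe", "example.com"))
```

If a handful of public names is enough to reproduce your real login names, those names should be treated as public information rather than as any kind of secret.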
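One practical way to understand what is out there is to audit the documents you publish yourself. The sketch below assumes you have already run ExifTool (the metadata tool mentioned in the transcript) with its JSON output option over your own files, for example exiftool -json *.pdf, and summarises which author-like values appear in which files. The tag names checked here are an assumption, since the tags present vary by file type, and the sample data is invented.

```python
import json

# Tag names that commonly hold a person or account name (assumed, not exhaustive).
AUTHOR_TAGS = ("Author", "Creator", "LastModifiedBy")


def leaked_names(exiftool_json: str) -> dict[str, set[str]]:
    """Map each author-like metadata value to the files it appears in."""
    found: dict[str, set[str]] = {}
    for record in json.loads(exiftool_json):
        for tag in AUTHOR_TAGS:
            value = record.get(tag)
            if isinstance(value, str) and value.strip():
                source = record.get("SourceFile", "unknown")
                found.setdefault(value.strip(), set()).add(source)
    return found


# Invented sample in the shape of ExifTool's JSON output (a list of objects).
SAMPLE = """[
  {"SourceFile": "report.pdf", "Author": "jsmith"},
  {"SourceFile": "minutes.docx", "Creator": "Alex Example", "LastModifiedBy": "jsmith"}
]"""

for name, files in leaked_names(SAMPLE).items():
    print(name, sorted(files))
```

Any value that looks like an internal username is a candidate for stripping from the document before it is published.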
How Can We Make People More Aware?
First of all, you can make people aware that this is now publicly available information and that it shouldn't be used as a factor of authentication. Implementing controls such as multi-factor authentication on publicly available login portals and cloud environments would also help to mitigate the risk of this information being out there.
To conclude, this blog, alongside the podcast, is the ultimate guide to OSINT and Google hacking (or Google dorking), with the goal of protecting you from having sensitive information leaked online. It highlights the main ways to limit how much of your sensitive information is online and the precautions you can take going forward.
To listen to the podcast version of this blog, go here: https://open.spotify.com/episode/2ioc7Qpywr8ZdubV9wm4tI?si=c61d4e3ee94e475a *
Here at DarkInvader we provide a thorough Threat Intelligence service that combines automated tools with human research and OSINT. To learn how we can help your business, get in touch.
*For those who are hard of hearing, please see the full transcript of the TechBites 01 podcast:
Hello and welcome to the first of many DarkInvader OSINT deep dives. I'm joined by our technical director, Gavin Watson, and one of the research team, Liam Follin, to discuss the magic that is Google Dorking. Without further ado, Gavin, what is Google Dorking?
Gavin Watson (Technical Director)
So Google Dorking, it's sometimes called Google hacking as well. And it's using the Google search engine in a more advanced way. Basically, it's using some of the extra syntax options that Google offers. Now, this can be really, really simple. And I think a lot of people are aware of some of the extra little commands you can put into a Google search box. So for example, if you wanted to search something on the BBC website, and you wanted to make sure that you only receive results from the BBC website, and nothing else, then you can type site colon bbc.com, for example. Now, what people found was that they could take advantage of the wide variety of these extra options to find things via Google that you really wouldn't expect and to really narrow the search down to very, very specific things. So, for example, it's possible to find vulnerabilities in people's systems, it is possible to map an attack surface for a target, and you can use these options to generate lists of potential usernames. And these are all really, really useful types of information for an attacker, or, in terms of us, for a pen tester or a security consultant.
Liam, do you have anything to add on to that?
Liam (Security Consultant)
No, that was very eloquent. Effectively, Google's an incredibly powerful search engine, more powerful than I think a lot of people realise, and all we're doing is leveraging that power slightly more efficiently, to allow us to find out information about people and services, coming from the security side, or, coming from the research side of things, just trying to find out information about businesses that could then be leveraged to cause harm and damage. I think there are some pretty good examples of this. I know you've worked on certain tools that use Google Dorking, specifically around LinkedIn.
Gavin Watson (Technical Director)
Absolutely. So LinkedIn is very difficult to scrape, relatively speaking, compared to a lot of other websites out there. And that's on purpose. LinkedIn don't want people to be able to scrape names and addresses because, at the end of the day, LinkedIn contains not particularly sensitive information, but if it could be scraped en masse, then that could be quite an issue: getting hold of huge amounts of names and employment history and people's likes, endorsements, things like that. You know, they don't want people to scrape it, for a variety of reasons. But Google is searching and indexing all of the time. And so you can scrape LinkedIn, to an extent, via Google. And how this works is very, very simple. Just going back to what I said previously, you can refine the search with site colon LinkedIn. And then you can use some of the other syntax, like intitle, for example, so that in the results you get back from Google, every single one is an individual LinkedIn user for the company you are interested in. And that's quite a powerful thing. You can't scrape LinkedIn, but you kind of have scraped the names for your target company via Google. And then with a little bit of coding, you can change that output into a valid list of names or usernames. And this is half the battle. If you're going to attack a company, if you're going to brute force a login portal, you need usernames and names. And with a little bit more coding, you can change that first name and that last name into different email conventions, first dot last, or first initial and then the whole surname, because you might not know what convention the company uses. And so with a very simple Google hack, or Google dork, if you will, and a little bit of coding, you can generate a huge list of potential usernames.
And then it's just a case of picking a few common passwords, Password1, Password123, and it's a numbers game. Depending on how many you're sending in, it only takes that one individual to have a weak password and then, potentially, you're into a system. Another example is documents as well. Stripping out metadata from documents is a really common strategy for getting information, because documents hosted on people's websites often contain usernames, and they don't realise those usernames are hidden in those documents. But the first step is getting hold of those documents. And how do you do that? Again, with simple Google hacks, you can accomplish this. So you can search for a company name, and then just put something simple like file type colon PDF, and all the results you get back will be PDFs for that company. Then you can download them all manually, if you want, and then extract that metadata using the various freely available tools like ExifTool and things like that. And there are tools out there that automate this process, like the old tool Metagoofil, for example. But the point is, you can do it very easily, manually, yourself, just with Google. And I think there have been quite a lot of instances recently, war stories, if you will, I guess, where we've used these kinds of Google hacks and we've found, you know, really quite sensitive information.
Liam (Security Consultant)
Absolutely. And, again, kind of leveraging that powerful search engine to try and find things. Those are some really good examples. We've also used it to find people's residential addresses, by piecing together from various social media sites this image of a person, how they normally present themselves on social media sites. And we were then able to use the event information, cross-referencing with things like Google Maps, for example, and we were actually able to find where somebody lived, simply based on, again, using these Dorking strings, using this way of interacting with Google. And again, in a reasonably short amount of time: it only took about 45 to 50 minutes for the researcher in question to actually go from finding the name all the way through to having this residential address. And naturally, I'm sure most people don't really want their residential addresses to be publicly known online or to be found by the sorts of people that are going to be doing this research. And there are other examples as well where Google Dorking has revealed quite sensitive information, not perhaps as serious as exactly where somebody lives. But you can use it, for example, to find subdomains. Or, more specifically, you can use it to find S3 buckets that have been crawled by Google. And once you've found that S3 bucket, there's a very common misconfiguration with S3 buckets, leaving them open to the public. And that lists all the files, or the first 1,000 files, strictly speaking. And one of our researchers, again using these very simple site and inurl Google Dorking, Google hacking strings, was able to find quite an extensive list of driving licences and passports that had just been left in an S3 bucket, presumably by someone in the HR department for one of these businesses.
And naturally, that, again, was a very simple technique. But the damage that can be caused as a result of that, or the potential for damage from somebody just using these very simple techniques, is almost in
Gavin Watson (Technical Director)
The residential address is quite an interesting one. I mean, some people might think, well, so what, you know, loads of people know where I live. But if you think about how many times you've been on a phone call to a business and they've authenticated you by asking for the first line of your address, you know, that can be quite serious. And in the worst-case scenarios you hear about people being swatted, where, you know, someone calls in a gun crime at a particular residential address to get a SWAT team sent out, and things like that. So, you know, the little bits of personally identifiable information like that, in the wrong hands, can be really quite damaging. So it's important to be aware of what is out there and, where possible, to get it removed.
Liam (Security Consultant)
Absolutely, which kind of nicely segues into, you know, what can you do about this? So, we've expanded over the last couple of minutes on what you can find out there, but now that we have found something out there, what are we going to be able to do about it? And unfortunately, with a lot of this open source intelligence, you can't just kind of switch it off. You need to understand what's out there, and you also need to understand the risk that information poses, and that will be very different for, you know, say, a residential address or a username. But once the information is out there, as I said, it's very hard to get it removed from the internet. That's not really how the internet works. But you can put in place a series of compensatory controls. So, a residential address is out there. That's fine. What are we going to be able to do about it? Well, first of all, you can try and make people aware that that's now publicly available information and that it shouldn't be used as a factor of authentication. In an extreme case, you can also move. Obviously, that's probably slightly over the top, but again, that might be necessary. If, for example, it's possible to use Google Dorking to find lists of usernames for your business, then implementing things like multi-factor authentication on publicly available login portals and cloud environments would, again, help to mitigate the risk of having this information out there.
Thank you both for talking us through the first of many OSINT deep dives. Join us next week.