Hacker
Professional
The content of the article
- We break through the Telegram bot
- Find out the phone number in zero touches
- Search email via network honeypot
- Skeletons in the Telegram closet
- Hidden data from the forum
- Conclusion
According to American intelligence officials, about eighty percent of useful information comes from publicly available sources, and only a small part is obtained directly by agents. The difference is huge, but it is precisely the details obtained by agents that make it possible to confirm and verify the overall picture.
This is well illustrated by the story of how the CIA, in Soviet times, reconstructed the power supply scheme of the Urals and its nuclear industry plants starting from a photograph in a magazine. Analysts latched onto a single retouched image of a control panel and then did a great deal of work collecting data from Soviet newspapers, magazines and diplomatic mission reports. But it was terrain photographs from reconnaissance balloons that proved decisive, because there was no other way to confirm the conclusions about where the power lines and factories were located.
Being able to obtain this kind of "intelligence" is just as important when gathering information about people. It may be an incident investigation in which you look for whoever is behind an infection, or the identification of an individual in order to map an attack surface. Below we look at some unusual cases of obtaining information when collection from open sources does not give the desired result.
We break through the Telegram bot
As you know, Telegram offers a wide range of functionality for bots. It costs nothing to create a new one, get a management token and plug it into a ready-made script from GitHub. Tokens leak just as easily: you can find them with elementary dorks. And sometimes you simply come across malware that uses Telegram as its control panel - if you only knew how many stealer archives are stored in the messenger's cloud!
What stops someone from pulling all of a bot's data through a leaked token? The Telegram API, of course, which restricts simultaneous connections to one bot from different locations. Even if the bot is not running or its owner ignores connection errors, we can still receive only new, previously unprocessed messages. Finally, the bot keeps no record of who it has talked to in the past. So you cannot read the history this way - you have to wait quietly for someone to write and intercept the message.
Fortunately, the HTTP API for bots is just a layer on top of the original MTProto API, which lets you work with bots the same way as with human accounts. And there are ready-made libraries for this protocol! Personally, I recommend Telethon, which recently added support for managing bots.
As you can imagine, this expands the possibilities by an order of magnitude. Now we can connect without the bot owner noticing and pull out the message history. The first dialogue, obviously, will be with the owner. I wrote a simple script that pulls the correspondence and saves the media, names and photos of the interlocutors.
By the way, you probably remember that Telegram recently added the ability to delete your messages on the other party's side as well. Well, if you delete a message from a dialogue with a bot, the bot still keeps its copy.
Find out the phone number in zero touches
Imagine you need to find out the number of the phone in another person's pocket without attracting attention. There are many ways to do this, of course, from brute interception with an IMSI catcher to social engineering. Until 2018, the Wi-Fi network in the Moscow metro readily gave away the phone number and advertising profile of a device's owner by its MAC address.
Researchers from Hexway recently published details of how Apple devices talk to each other over Bluetooth LE. For mutual identification they use hashes of the owner's account data: three bytes of the SHA-256 of the phone number, the Apple ID and the email address. Yes, iPhones, MacBooks and even headphones literally shout into the air who they are and what mode they are in.
Fortunately, the hashes are not sent all the time, only in specific cases, for example when the screen for entering a wireless network password is opened. But that is enough: suggest that the person connect to Wi-Fi, and you have the hashes of their data.
Also remember that a Russian number has eleven digits, so three bytes will give only a few dozen collisions, and even fewer if we know the region and the operator. The authors of the study wrote a convenient script that generates hashes for the required ranges, and some enthusiasts have already built a ready-made database with non-existent operator codes filtered out. Once you have the list of candidate numbers, all that remains is to narrow down the "suspects" using HLR requests to check whether the subscriber is active, and to check whether the number has messenger accounts. A couple of iterations, and you have the number you were after.
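The collision estimate can be checked with a little arithmetic. The sketch below is only an illustration under stated assumptions: that the broadcast value is the first three bytes of the SHA-256 digest, that the number is hashed as a plain digit string, and that the candidate space is the roughly one billion Russian mobile numbers of the form 79XXXXXXXXX. The function names are hypothetical and are not taken from the Hexway tooling.

```python
import hashlib

HASH_BYTES = 3  # assumption: the first three bytes of the digest are broadcast


def truncated_hash(value: str, n_bytes: int = HASH_BYTES) -> str:
    """Return the first n_bytes of SHA-256(value) as a hex string."""
    return hashlib.sha256(value.encode("utf-8")).digest()[:n_bytes].hex()


def expected_candidates(space_size: int, n_bytes: int = HASH_BYTES) -> float:
    """Average number of inputs that share one truncated hash value."""
    return space_size / 2 ** (8 * n_bytes)


if __name__ == "__main__":
    # Assumption: mobile numbers look like 79XXXXXXXXX, i.e. 10**9 candidates.
    print(f"{expected_candidates(10 ** 9):.1f} numbers per 3-byte hash")
    # Fixing one three-digit operator code leaves 10**7 candidates.
    print(f"{expected_candidates(10 ** 7):.2f} numbers per hash with a known code")
```

With about 60 candidates per hash value over the whole mobile range, "a few dozen" is the right order of magnitude, and fixing a single operator code brings the expected count below one.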
Don't forget about the Apple ID and email hashes either! Now you have everything you need to verify a person's account details that you find in open sources.
Search email via network honeypot
Now suppose we are in the opposite situation: we can reach a person only over the network, for example through Jabber. We know that he develops malware, but he leads a double life and has an official job. We would love to find out who he is, but he uses Tor and a VPN exclusively. What can we catch on to in this case?
Just as every person has characteristic handwriting, code written by different people often differs in small details. What if we have one of the target's scripts in our hands? Then we can search open repositories for its key strings, in the hope that he wrote or reused the same code somewhere under his real identity. Having found an account, we download all the repositories available from it and get a list of names and email addresses with a simple command:
Code:
$ git log --pretty="%an %ae%n%cn %ce" | sort | uniq
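To run the same check over many cloned repositories, the listing can be wrapped in a small script. This is a minimal sketch, assuming `git` is on the PATH. The function names, the tab-separated format and the `--all` flag are my own choices for illustration, not part of the original command.

```python
import subprocess
from typing import Set, Tuple

# Author and committer on separate lines, name and email split by a tab
# (a tab is safer than a space because names may contain spaces).
LOG_FORMAT = "%an%x09%ae%n%cn%x09%ce"


def parse_identities(log_output: str) -> Set[Tuple[str, str]]:
    """Turn 'name<TAB>email' lines into a set of unique (name, email) pairs."""
    identities = set()
    for line in log_output.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        name, email = line.rsplit("\t", 1)
        identities.add((name.strip(), email.strip().lower()))
    return identities


def repo_identities(repo_path: str) -> Set[Tuple[str, str]]:
    """Collect every author and committer identity from all branches."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", f"--pretty=format:{LOG_FORMAT}"],
        capture_output=True, text=True, check=True,
    )
    return parse_identities(result.stdout)
```

Calling `repo_identities("path/to/clone")` for each downloaded repository and merging the resulting sets gives one deduplicated list of names and addresses for the whole account.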
If you only knew how often people inadvertently commit under one account and push under another!
But this kind of search is imprecise, and there are other ways. What else can give away an active user of a Git hosting service? A hint: the public key by which the server identifies you! Often this is the same SSH key generated in id_rsa, although a double life implies several identities and several accounts, and one of those keys is probably also used to connect to servers. Do you see what I'm getting at?
For completeness, let's clarify two more things. First, the public keys of all GitHub and GitLab users are openly available: just add .keys to the profile URL. Second, when connecting, the SSH client offers in turn all the keys that are explicitly specified for the server or added to the agent. This is the whole essence of our adventure.
We get the person to connect to our server, which runs a patched SSH daemon that records every key offered to it. Then we look those keys up in a database of all users of the open repositories! The idea is simple and ingenious, and it has already been implemented in some form. Type in the console:
Code:
$ ssh whoami.filippo.io
The server will try to find your GitHub account and, at the same time, check you for CVE-2016-0777.
The server code is open source, and at its core is a database collected for a study of how trustworthy GitHub users' keys are. The data itself was not published, but other people have since repeated the collection on their own. Still, let me remind you that it is better to collect up-to-date user data yourself: people rename accounts, create new ones, and some even change passwords.
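The lookup side of this idea can be sketched as follows. It is a minimal sketch, not the code of the whoami server: it relies only on the `.keys` profile URL mentioned above and on the standard OpenSSH SHA256 fingerprint format. The function names and the in-memory index are assumptions made for illustration.

```python
import base64
import hashlib
import urllib.request
from typing import Dict, List


def fingerprint(public_key_line: str) -> str:
    """OpenSSH-style SHA256 fingerprint of one 'type base64 [comment]' line."""
    key_blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH prints the digest as base64 without '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def fetch_public_keys(username: str) -> List[str]:
    """Download the key lines GitHub publishes at /<username>.keys."""
    url = f"https://github.com/{username}.keys"
    with urllib.request.urlopen(url, timeout=10) as response:
        text = response.read().decode("utf-8")
    return [line for line in text.splitlines() if line.strip()]


def build_index(usernames: List[str]) -> Dict[str, str]:
    """Map fingerprint -> username for a (hypothetical) list of accounts."""
    index = {}
    for username in usernames:
        for key_line in fetch_public_keys(username):
            index[fingerprint(key_line)] = username
    return index
```

Because the fingerprint depends only on the key blob and not on the trailing comment, a key offered to a server can be matched against the index even if its comment differs from the one stored in the profile.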
Skeletons in the Telegram closet
How often have you had to dig up information about a Telegram account? For me it happens far too often. Sometimes the owner's identity is revealed by metadata such as the username (when it looks like a real person's first and last name) or by the account photos and the dates they were uploaded. But sometimes none of this is available, only a completely uninformative First and Last name.
Here is what identification looks like in a real search. First, we find out the account ID. Official clients hide it, but some unofficial ones show the ID when you open a profile. It is more convenient to forward a message from the person to the @userinfobot bot, which will return the coveted numbers.
Sometimes this method fails because of the account's privacy settings. There is a way out: with scripts, you can use the API to track the system fields in messages and events. After that we will still recognize the person if he changes his name, leaves shared chats or disappears for a while.
Next, we look for data in specialized search engines. The most useful one for Telegram is search.buzz.im, which indexes all open chats. It even lets you search by post author! But it will be of no use if the account owner has recently changed their name. Searching by ID takes a lot of luck: in theory the ID could have appeared as text in an open chat and been indexed, but in practice the chance is low.
We also try a lookup by ID. Unfortunately, Telegram does not return account data if all you have is an identifier - this is its basic level of privacy. So there are no tools for a direct lookup, except perhaps databases that do a reverse search by ID over accounts whose data was collected in a "legal" way (that is, through the API, by sharing chats with the person).
Who has such databases? Obviously, bots that serve a huge number of open chats, closed chats and individual users. Nothing prevents their developers from saving all the account data they see.
That leaves only the practical part. In one of my cases, the search for a source of information led to a group bot that keeps track of karma. Naturally, it used a database to update and synchronize the values for each account. And naturally, the developers could not resist writing a simple PHP web interface for checking karma by nickname, name and user ID.
Finding the information turned out to be even easier than expected: a query returned a list of matching accounts with name and ID (that is, autocompletion worked). Entering the identifier brought up the first record from the database, that is, the first username under which the bot had seen the account. Bingo!
But if you think that was all, you are greatly mistaken. The autocompletion query turned out to have an SQL injection, through which it was possible to download the database of all users: every account that had ever joined a group where the bot was present, together with the history of name changes, join dates and other information. I don't think the moral needs spelling out.
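Why an autocompletion query is such a common place for an injection is easy to show on a toy example. The sketch below uses an in-memory SQLite table with made-up names (`users`, `username`, `karma`). The real interface was written in PHP, and the schema here is hypothetical, not the bot's actual one. It contrasts a query built by string concatenation with a parameterized one.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create a toy table; all names and rows are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER, username TEXT, karma INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(1, "alice", 10), (2, "bob", 3), (3, "carol", 7)],
    )
    return conn


def autocomplete_unsafe(conn: sqlite3.Connection, prefix: str) -> list:
    """Vulnerable: user input is pasted straight into the SQL text."""
    query = "SELECT username FROM users WHERE username LIKE '" + prefix + "%'"
    return [row[0] for row in conn.execute(query)]


def autocomplete_safe(conn: sqlite3.Connection, prefix: str) -> list:
    """Parameterized: the input is always treated as data, never as SQL."""
    query = "SELECT username FROM users WHERE username LIKE ? || '%'"
    return [row[0] for row in conn.execute(query, (prefix,))]
```

With an ordinary prefix both functions behave the same. With input such as `' OR 1=1 --` the concatenated version returns every row in the table, while the parameterized version returns nothing.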
Hidden data from the forum
What if all we know is a person's account on some site, and a search by nickname returns nothing? Registration implies that the person left some personal data there and at least entered an email, which is what we need for the further investigation. But here is the bad luck: privacy is set to the maximum, and all the data is visible only to the user himself.
Let's turn our own secrecy up to the maximum as well: social engineering is off the table, because the person may become suspicious of a bait account and clean up traces in other important places. Of course, you could try to steal the cookies of a forum administrator or moderator, get extended rights and direct access to the database...
However, I will describe a more elegant way to solve such a problem. Before continuing, a warning: I do not recommend repeating these actions, if only because they go beyond passive information gathering.
These days it is hard to find an XSS vulnerability in popular forum engines. But does that mean there are none? That is what the author of this case wondered, and he was not mistaken: the vulnerability was found in the phone field of the user profile, a page that is open to everyone. What does this mean for us?
As you might have guessed, it makes it possible to hijack a user's session, but even that would be overkill. Why log into the forum to view data that is already within reach? That's right, we simply inject a JS script into the page; it requests the active user's information via AJAX and sends it to our server.
But, as you remember, we agreed not to attract the attention of the person we are investigating. So let's create conditions in which our XSS reaches him through someone else's hands, for example through the accounts of his acquaintances. One more small touch is needed: in addition to the direct request for information, the JS sends a further request that saves the XSS in the viewing user's own profile field. Are you starting to see the whole picture?
We create the most attention-grabbing account possible and post a shocking message on the forum. I leave the nuances of a specific implementation to your imagination. It does not matter much who reads it or whether we get banned: the account becomes patient zero. Someone visits our profile to look at the details and starts spreading the data-collecting virus. The script spreads in a wide wave, infecting friends, acquaintances and casual interlocutors one after another.
And so it happened. After a short time the data of the target user arrived at the command server, including the email we were after, along with information about a good half of the other forum members. It is a pity that nobody recorded through how many handshakes "the award found its hero", otherwise a study of social graphs on forums could have been written along the way.
By the way, a similar scheme was recently used on Facebook, for show rather than to collect data. Imagine how long attackers could exploit such vulnerabilities covertly, collecting our data.
Conclusion
Here we have looked at only five non-standard ways to get information that cannot be reached from open sources. There are always tricks and hacks, but keep in mind that someone may have walked this path before you: put a database on the Web, found a leak on a site, created a Telegram bot for lookups. So I recommend compiling your own list of the most useful open sources and going through it before you start digging deeper. For OSINT on the Runet, I recommend a bot with a comprehensive set of tools for all occasions - HowToFind bot.
One more thing: do not forget to organize the data you collect. Often a simple mind map lets you see connections that are not obvious and discard the irrelevant ones. You may not need any clever tricks at all to find out what you want.