Recently, I had the pleasure of speaking with Batuhan Özyön, founder of Scrape.do, a powerful web scraping service that handles an impressive 500 million daily requests.

Batuhan shared fascinating insights about his entrepreneurial journey, the technical obstacles Scrape.do overcomes, and his vision for the future in the age of AI.

Key Points from the Transcript:

  • Founder Background:
    • Batuhan Özyön started his journey at a young age by building simple websites.
    • Got into reverse engineering and bot creation through gaming, particularly in “Knight Online.”
    • Studied Computer Engineering and gained professional experience before founding Scrape.do.
  • About Scrape.do:
    • Manages approximately 500 million daily web requests with a nearly 100% success rate.
    • Initially started as a personal tool to bypass scraping blocks encountered during freelance projects.
    • Evolved into a robust platform offering reliable data extraction by simulating real users.
    • Legally and ethically provides public web data.
  • Technical Details:
    • Features include a proxy pool of 110 million residential IPs worldwide.
    • Bypasses firewall and blocking mechanisms (like Cloudflare).
    • Offers browser automation with advanced fingerprinting techniques.
    • Optimized for cost-effective and reliable scraping solutions.
  • Team and Growth:
    • Currently has an 11-person team, with 70% focused on technology.
    • Works with around 600 global enterprise clients.
    • Exceptional customer support is a key differentiator, provided by technical engineers available 24/7.
  • Market and Challenges:
    • Scraping is crucial for market intelligence, price monitoring, and data-driven business strategies.
    • Companies commonly attempt to block scraping activities; Scrape.do addresses this through ongoing technical innovation.
  • Future Developments:
    • Planning to launch additional products: direct proxy service, browser network for automation, and ready-to-use datasets.

Full Video Transcript

Batuhan Özyön:
While playing a game, thinking about how we could cheat in it, how we could play it with bots, we've now reached this point. We're still dealing with reverse engineering. We're dealing with its reverse and its methods. And almost 500 million [requests of] traffic passes through us daily.

Batuhan Özyön:
And this traffic has a success rate close to one hundred percent. Yes, that's a very serious number. The crucial point. Actually, the part I enjoy is the technical side of it.

Emre Elbeyoglu:
Nice, this part is going to get longer then.

Emre Elbeyoglu:
In that case.

Batuhan Özyön:
Let me also give a figure: 17 million dollars. Good investment.

Emre Elbeyoglu:
This is a bit more difficult. Right now, something about that number didn't make much sense to me either.

Batuhan Özyön:
30% is a lot, though. Am I starting to think about our own ad campaigns now? Because it's very possible in our sector.

Emre Elbeyoglu:
Hello everyone. My guest today is Batuhan Özyön. I've known him for over 5 years. Today, we'll be listening to his startup and growth story together. Welcome, Batuhan.

Emre Elbeyoglu:
How are you, man?

Batuhan Özyön:
Welcome, man. I'm good. Thank you. How are you? What's new?

Emre Elbeyoglu:
How am I? It's been a long time. It's been 2 months since I continued this video series. So I'm happy.

Emre Elbeyoglu:
And I'm hosting someone I like. So I hope it will be enjoyable.

Batuhan Özyön:
Thanks. I hope it will be enjoyable for me too. I'm following. I hope the seeds of a new channel that will grow much bigger have been sown here.

Emre Elbeyoglu:
Our aim here is to be beneficial to people and listeners. So, I think we can start briefly. Who is Batuhan Özyön, what has he done so far, what does his startup Scrape.do do? We will proceed in this order. Then, what did he do to grow?

Emre Elbeyoglu:
The concept of the channel is about going back. We'll actually be talking about these. So let's start by getting to know you, man.

Batuhan Özyön:
I'm from Ankara, born and raised, so to speak. Or rather, I've been in Ankara for as long as I can remember. Even though our origins are different. I got acquainted with software and information technologies very early. I started building simple websites perhaps when I was 11-12 years old, all these processes.

Batuhan Özyön:
But what changed my life was actually a game. While playing a game, thinking about how we could cheat in it, how we could somehow play it with bots, we've now reached this point. We're still dealing with reverse engineering. We're dealing with its reverse, its methods. The game's name is Knight, right?

Emre Elbeyoglu:
Knight Online. There's probably no one in our age group who hasn't played it. You're younger, of course, but I knew it back then too, was its name Co-Ex?

Batuhan Özyön:
That's what they called it. But it was different. I mean, it had many names. Let's not go too deep into it, the game is still live.

Emre Elbeyoglu:
By the way, I also used it. Since this is going to be a Turkish channel, it's going to go to those platforms. I don't think so, I mean. You've probably tried this for educational purposes anyway. Anyway, let's close this topic.

Batuhan Özyön:
Let's close this. Yes, man, thanks. Not just any game business, but in a general sense. Actually, after I discovered that I could program things technologically on a computer, I started to be able to do everything. This could be writing an automatic bot for a game, or making a webpage.

Batuhan Özyön:
My parents are in commerce, so I made an automation for them. In the end, I studied computer engineering. After computer engineering, we entered the sector. We continued with software. The first company I joined was a startup.

Batuhan Özyön:
In 2018. Afterwards, while thinking about what I could do on the side, the seeds of Scrape.do started to be sown around 2019-2020.

Emre Elbeyoglu:
It's been a while. I didn't think it was that long, of course. Time flies.

Batuhan Özyön:
We met around that time too. 2018-2019 or so.

Emre Elbeyoglu:
In your previous experience, I believe you were doing something related to this. You don't have to mention the company name. If you want, we can continue from there. Then you can explain what Scrape.do does.

Batuhan Özyön:
Sure, actually I was always doing scraping projects, big and small. A bit as a freelancer or at the company I worked for, I was doing work focused on scraping. You know how sometimes, unnecessary know-how builds up and you have to go very deep into a job and start proving yourself. This is actually at the very core of Scrape.do. While scraping a lot of sites myself, I discovered that I actually needed an internal tool.

Batuhan Özyön:
Because, of course, as you scrape target sites, they block you. And as long as they block you, you need to come up with a solution. And this technology is constantly evolving. Blocking technology is evolving. Consequently, scraping technology also needs to evolve as a counter-attack.

Emre Elbeyoglu:
And you created a startup that steps in right at this point.

Batuhan Özyön:
Yes, exactly.

Emre Elbeyoglu:
Now, it's useful to explain for those who don't know at all. Scraping is actually defined as "kazıma" in Turkish, but let me describe it as data collection. When you enter a site, say Amazon or a different site, it's about tracking prices there, or tracking prices or different information on a real estate site. This can also be on LinkedIn.

Emre Elbeyoglu:
The process of tracking any data is actually called scraping. But while people do this, the owners of these sites also develop various technologies to prevent it. Scrape.do, by developing different solutions that bypass these, ensures that this data flows smoothly. Did I define it correctly?

Batuhan Özyön:
You defined it perfectly, man. On our side, the definition is a bit like this. Let me also briefly explain it in just 1-2 sentences. When you send your bot to the target site, we take it. We upgrade it to a real user for the site.

Batuhan Özyön:
And we present it to the target site as if it were a real user. Consequently, no one can block anyone when this happens. But we are only in the public part of this business. That is, we enable the scraping of data that is publicly accessible on the web but blocked to bots.

Emre Elbeyoglu:
It's not actually an illegal thing either.

Batuhan Özyön:
Yes, that's the first question people might ask.

Emre Elbeyoglu:
Public data, by the way, as we mentioned LinkedIn earlier, there was the HiQ case, etc. If data is public, it is accessible to everyone, and using it does not constitute a crime. As far as I know, many products, these so-called "intelligence" data products, are all based on crawling and scraping web data like this and making it meaningful. Of course, the public web is not the only source of data. Therefore, this is neither illegal nor is it just an important channel used by large companies.

Emre Elbeyoglu:
Because in our age, we will come to artificial intelligence and AI. The most indispensable thing right now is actually collecting data correctly and making sense of it. We can say that Scrape.do plays a good role in this sense.

Batuhan Özyön:
Yes, there's also this situation here. Actually, all the applications we encounter in our daily lives have technologies that do scraping in the background. For example, applications used to get a better hotel room price.

Emre Elbeyoglu:
"Get the best price" etc., which we all hear in ads. It seems there's a lot behind this "getting the best."

Batuhan Özyön:
Yes, yes, this is just the simplest example of us being efficient.

Emre Elbeyoglu:
So, in short, Scrape.do develops a technology that makes a normal bot request look like a real user with a more meaningful fingerprint, enabling it to retrieve this data. To explain it a bit more technically. Because here, those who want to collect data, i.e., those who scrape, are up against brands that, as I mentioned earlier, have their own teams to prevent bots from getting this public data. What different solutions can we create by bypassing this? At this point.

Emre Elbeyoglu:
When you work with Scrape.do, there are no problems whatsoever. We can say that a data flow is ensured in the background with your efforts. Super, man, I think we've defined what we do well. We also talked a bit about the problems people face. I think you can talk a bit about the team or competitors.

Emre Elbeyoglu:
Let me not say competitors. I was going to say numbers. "Competitor" slipped out. By the way, even though there are many competitors globally, I don't think there are any in Turkey, or developers are struggling with this on their own, but I don't think they need to struggle, by the way, let's give credit and do some advertising. Let's talk about your current team and general numbers.

Emre Elbeyoglu:
If there's anything you can share about how many requests pass through us. If you briefly talk about these, it would be super.

Batuhan Özyön:
Of course, of course, that would be super. I also love talking about numbers. Especially since I'm still at the helm of the technology side. This means that all target sites, all firewalls, in the end, all accessible websites, all arguments that can be accessed, are processes where I'm involved, with my fingers on the pulse. We currently have a team of 11 people.

Batuhan Özyön:
70% of this is the technical team, and the other part is our team where our friends who can develop growth-oriented strategies are located. On the technical side, we have friends who work entirely on reverse engineering, focused on this. And in terms of numbers, I can say this: We currently work with nearly 600 companies worldwide. And almost 500 million [requests of] traffic passes through us daily. And this traffic has a success rate close to one hundred percent.

Batuhan Özyön:
Yes, it's a very serious number. And when you see the capacity, the backend, as a technology, a very satisfying picture emerges. As much as we'd like to share it, we can't.
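To put the daily figure in perspective, here is a minimal back-of-the-envelope sketch in Python. It assumes traffic is spread evenly across the day, which the interview does not claim; real traffic is likely burstier, so peaks would be higher than this average of roughly 5,800 requests per second.

```python
# Back-of-the-envelope conversion of the daily figure quoted in the interview.
DAILY_REQUESTS = 500_000_000    # "almost 500 million" requests per day
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_requests_per_second(daily_requests: int) -> float:
    """Average request rate, ASSUMING traffic is spread evenly over the day."""
    return daily_requests / SECONDS_PER_DAY


if __name__ == "__main__":
    rate = average_requests_per_second(DAILY_REQUESTS)
    print(f"{rate:,.0f} requests/second on average")
```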

Emre Elbeyoglu:
Actually, the numbers show this. Also, the fact that the technical team is in the majority, I think this is already a technical job and your direct target audience, as we understand, are developers dealing with this. Similarly, I want to mention a feature of Scrape.do that I think stands out. Support.

Emre Elbeyoglu:
Now, of course, the purpose of this video is not to directly praise Scrape.do, etc. It shouldn't be understood that way. Because I have evaluated some things objectively here. Support is a really important resource in terms of technical need. And for support, they usually put support specialists who don't understand this business, etc.

Emre Elbeyoglu:
So, dedicating a technical person to support in this sense is, I think, very valuable. And I haven't seen many companies that do this. I mean, both a technical person and an engineer, this is an engineer. And also being able to get a response within a few minutes when I write something, in a quality that I can call 24/7. This is very important, especially in global SaaS.

Emre Elbeyoglu:
I've only seen this in Scrape.do, and I saw it in DataforSEO. They are also very good. And I saw it in PostHog. These three companies, in terms of support, whenever I write, there's a real human who responds in some way. I find it very strange.

Emre Elbeyoglu:
But of course, it's also a very good thing. You must have had feedback from your customers on this matter. How much importance do you give to this issue? I wanted to hear a bit from you.

Batuhan Özyön:
This is a very comprehensive topic on our side. It's already at the core of our entire growth strategy. We'll get to that, of course, but. We always have priorities in support. Our first priority is this.

Batuhan Özyön:
We have a very serious ability to empathize, where we can put ourselves in the customer's shoes. This means, our customer is scraping a website and has developed a business model on it. The moment the data flow is cut, the chaos they will experience with their customers or the chaos their technical team will experience to solve it. It's a very stressful job. In the end, they don't just subscribe to Scrape.do and build things on the service they get from there.

Batuhan Özyön:
Scrape.do's team creates the impression of sitting at the same table with the companies, with the company's technical team. This is where the business starts to differentiate. Because, for example, other companies that we can call equivalents of Scrape.do or that people compare us to are predominantly proxy companies, and they only do IP leasing. But after leasing the IP, they don't support any of the remaining processes. Anyone can change an IP address today.

Batuhan Özyön:
The work we do on top of that is what makes us different from others. At this point, with our empathy, while the customer is still asking us that question, we evaluate everything about how they are experiencing that problem and take action accordingly. And since we can take action very quickly at this point, because more or less everyone starts scraping the same domains after a while, perhaps within 1 minute or 5 minutes, the support issue is resolved, meaning we close the incoming ticket directly.

Emre Elbeyoglu:
Very good. This actually significantly nurtures the sense of trust for the person who will work with you. Because as you said, instead of experiencing that chaos, I want to entrust this scraping or proxy solution I'll work with, eyes closed, and let the work proceed. There's actually a solution there that feeds very important data. Therefore, I think they get that trust from you through support, and that's how your customer stays.

Emre Elbeyoglu:
Likewise, your churn rates, apart from certain key customers, must be very low. Or you can talk about how many of these 600 customers are enterprise and how many are developing and selling on their own.

Batuhan Özyön:
More than 80% of our customers are currently enterprise. Churn in that 80% segment is 0% for us, meaning it's our safe zone where there's no churn. Our churn rate in the remaining 20% is 20%, by the way. It's very low, and that part fluctuates every month. In the enterprise segment, it's more about volume variability for us.
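As a rough worked example of what these figures imply overall, the sketch below combines the two segments into one blended churn rate. It assumes the 80/20 split refers to shares of the customer count and that both churn rates cover the same period; the interview does not spell out either point. Under those assumptions the blended figure comes to about 4%.

```python
def blended_churn(segments):
    """Weighted churn across segments.

    segments: iterable of (share_of_customers, churn_rate) pairs,
    where the shares are assumed to sum to 1.
    """
    return sum(share * churn for share, churn in segments)


if __name__ == "__main__":
    # Figures quoted in the interview; the per-customer weighting is an assumption.
    segments = [
        (0.80, 0.00),  # enterprise customers, no churn
        (0.20, 0.20),  # remaining customers, 20% churn
    ]
    print(f"Blended churn: {blended_churn(segments):.1%}")
```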

Batuhan Özyön:
A company that scrapes, say, 500 million pages one month might scrape 100 million the next month, or 1 billion. Adapting to this rapid dynamic is also a problem here. But we are quite good at this point both in terms of pricing and performance. And we are improving it further.

Emre Elbeyoglu:
Well, super. I mean, the fact that there's not just technology but also a technically good team behind it is, I think, one of the reasons customers prefer you. Likewise, we at LiveChat AI are currently in talks with Scrape.do. Even if Skype S doesn't have a lot of protection, it blocks a lot or can cause problems. So, we can say it's a good solution that's been developed. Even before we get to the growth topic, if you want to explain the product in a bit more detail, it might be good, because the listeners could be developers, technical people, and so on.

Emre Elbeyoglu:
So it might make more sense to them. Because this isn't a standard scraping API. It has a proxy pool that you manage. You can send requests from any country you want. Or you can render the browser and provide this, cookie setting.

Emre Elbeyoglu:
There are many features like this, actually. If you could briefly talk about the distinctive features that make you, you, then let's move on to Growth.
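To make the features mentioned here concrete (country selection and optional browser rendering), the sketch below shows how a request to a scraping API of this kind might be assembled. The endpoint and the parameter names (`token`, `url`, `country`, `render`) are hypothetical placeholders chosen for illustration, not taken from the interview; the provider's documentation is the authority on the real ones.

```python
from urllib.parse import urlencode

# HYPOTHETICAL endpoint and parameter names, for illustration only.
API_ENDPOINT = "https://scraping-api.example/"


def build_scrape_request(token, target_url, country=None, render=False):
    """Return the full API URL that asks the service to fetch target_url."""
    params = {"token": token, "url": target_url}
    if country:
        params["country"] = country  # exit through a proxy in this country
    if render:
        params["render"] = "true"    # ask for headless-browser rendering
    # urlencode percent-encodes the target URL so its own query string survives
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_scrape_request("MY_TOKEN", "https://example.com/item?id=1",
                               country="au", render=True))
```

The client only ever talks to one endpoint; proxy selection and rendering become parameters rather than infrastructure the developer has to run.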

Batuhan Özyön:
The crucial point, by the way, the part I enjoy is the technical side.

Emre Elbeyoglu:
Nice, this part is going to get longer then.

Emre Elbeyoglu:
In that case.

Batuhan Özyön:
Actually, the technology within Scrape.do is something we release hundreds of versions of, perhaps daily, every moment, and it's constantly being developed. We currently have 110 million residential proxies worldwide. This means we can obtain IP addresses from any country, any city, and if we push it, even at the district level. These IP addresses are from real devices. Meaning, IP addresses from the phone in my pocket or the computer in my home.

Batuhan Özyön:
These are, of course, IP addresses collected from ethical sources. Absolutely.

Emre Elbeyoglu:
They provide these in a perfectly legal way. By the way, it would be super if you could briefly define "proxy" in parentheses. We're explaining it for those who know, but let's not skip defining proxy.

Batuhan Özyön:
At the end of the day, a proxy is actually a separate layer that sits between the communication of two computers, and we can configure this layer in different locations, on different devices. For example, when I'm accessing the outside world from my office in Ankara, Turkey, and I want to choose a location in Australia or London, England, we can think of it like a VPN, but the protocol, the technology is different. In non-technical terms, I can summarize it as being able to go out with a different IP address. The most fundamental features that distinguish us are actually the 110 million residential, mobile, and datacenter proxies we have, and then the truly value-added nature of the service we provide.
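The proxy idea described here can be sketched with Python's standard library alone. The proxy address below is a hypothetical placeholder, and the sketch only builds the client; nothing is sent until `open()` is called.

```python
import urllib.request


def build_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Return an opener that routes HTTP and HTTPS traffic through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    # Hypothetical proxy address; replace with a proxy you are allowed to use.
    opener = build_proxy_opener("http://proxy.example:8080")
    # opener.open("https://example.com/") would now leave via the proxy,
    # so the target site sees the proxy's IP address rather than yours.
```

Choosing a proxy located in another country is what makes the request appear to originate there.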

Batuhan Özyön:
For example, let's talk about LiveChat AI. It wants to scrape a website, and this website uses Cloudflare or different firewall services. When it sends a request with Scrape.do, Scrape.do recognizes the firewall and automatically applies the features you need to use in the background, allowing you to scrape without any blocking. There's a situation here. For example, just for Cloudflare, there are perhaps more than ten techniques we can use to send requests without a bot in the background.

Batuhan Özyön:
These are solutions that, in total, cover 50-60 different types of firewall systems, technically ready and waiting for your request, and we approach the entire process cost-effectively here. For example, where you would normally need to send a browser request to access a website, Scrape.do can access that website with a simple HTTP request and get the data inside much faster, rendering it with a browser only if necessary.

Emre Elbeyoglu:
You get it much faster this way.

Batuhan Özyön:
And since we have thousands of browser farms in the background, maintaining or scaling them is not in the customer's hands at all, nor do they need to. They just send a request and get a response.
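The cost-saving idea described above, trying a plain HTTP request first and paying for browser rendering only when needed, can be sketched as follows. This is an illustration under stated assumptions, not Scrape.do's implementation: the two fetchers are injected callables standing in for real clients, and the `looks_blocked` heuristic is a deliberately crude placeholder.

```python
def looks_blocked(status: int, body: str) -> bool:
    """Crude ASSUMED heuristic: 403/429 or an empty body counts as blocked."""
    return status in (403, 429) or not body.strip()


def fetch_cheapest_first(url, fetch_http, fetch_rendered):
    """Try the cheap plain-HTTP path first; fall back to browser rendering.

    fetch_http and fetch_rendered are callables taking a URL and returning
    (status_code, body). Returns (method_used, body).
    """
    status, body = fetch_http(url)
    if not looks_blocked(status, body):
        return "http", body
    # Only pay the cost of a full browser when the cheap path fails.
    status, body = fetch_rendered(url)
    return "browser", body


if __name__ == "__main__":
    # Fake fetchers so the sketch runs without any network access.
    blocked_http = lambda url: (403, "")
    rendered = lambda url: (200, "<html>rendered</html>")
    print(fetch_cheapest_first("https://example.com/", blocked_http, rendered))
```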

Emre Elbeyoglu:
So, you gave Cloudflare as an example. When Cloudflare makes an update, you also have to release a fix that can somehow overcome this again, and Cloudflare is just one of them. Yes.

Batuhan Özyön:
Absolutely. For this...

Emre Elbeyoglu:
There are much more serious services. That's why web scraping isn't as easy as they show on YouTube, like "scraping with AI is very easy," etc. I think we talked about this recently, AI, scraping, is probably the last thing that can overcome these obstacles, let me say that for now. Maybe in the coming years, it might. Therefore, I already know that a lot of different things are being done in the background, but the audience shouldn't be fooled into thinking it's that easy.

Batuhan Özyön:
Yes. I mean, sleepless nights. Very little sleep and working under intense stress. That's an undeniable reality, unfortunately. But enjoying it is very important.

Batuhan Özyön:
Because it's not a job that can be done without enjoying it. We've forgotten how to write normal software. Just from dealing with reverse engineering.

Emre Elbeyoglu:
Are there any other prominent features you can talk about? Or we can slowly move on to growth and growth strategies.

Batuhan Özyön:
If I get too technical, I can probably talk until morning. So...

Emre Elbeyoglu:
Then let's leave it here, and if there are more detailed technical questions, they can reach you and your team at scrape.do and get information. If there's such a need, you'll be talking anyway, let me close this topic. Since the day we met, we've talked about many things. You've also progressed with a good strategy.

Emre Elbeyoglu:
You've reached 600 customers now. What has been your growth strategy so far? Because doing the job is difficult, and it really requires engineering and a mindset, but being able to sell it is a different thing. Because people need to trust you, you need to explain its outputs, you need to reach them, and most of these customers are global, gaining their trust is much harder. So, what did you do to grow from the beginning, especially in the beginning, which is very difficult? Can you briefly explain?

Batuhan Özyön:
Sure. At first, this...

Emre Elbeyoglu:
This is the hardest question.

Batuhan Özyön:
Yes, it's difficult, but it's actually a topic where I was lucky. A topic where I've always considered myself lucky. The first customer who uses the service, the very first one, is very important. For every startup, for every venture, for every project. The company that first tried our project, the first to try Scrape.do, was very large.

Batuhan Özyön:
And it was very easy for us to see its mistakes and position it correctly in the market. When I first started ScrapeDoo as a side project, I was still using it myself. But after a while, the number of friends around who knew the know-how here started to increase, which was around the time we met. Word of mouth, actually, a very large company heard that I had made such a tool. And they wanted to try it.

Batuhan Özyön:
They tried it and bought it within a week. And I was shocked. And the volume was very high. Maybe 2-3 times the total scraping operation I had done before came in all at once. And it started to grow slowly.

Batuhan Özyön:
Every month, 5%, 3%.

Emre Elbeyoglu:
So, you started the game with a good MRR from day one.

Batuhan Özyön:
Yes, I started with a good MRR.

Emre Elbeyoglu:
You started with enough money to support yourself. Nice.

Batuhan Özyön:
Right when I was getting married, in fact, I quit my job afterwards. And with the strength that quitting gave me, after the quality of service I provided there, I focused only on the job and then spent time only on improving the service.

Emre Elbeyoglu:
During this time...

Batuhan Özyön:
Word of mouth continued. The quality of the service I provided, and in the end, without any problems, truly as a one-man giant team, I started to grow Scrape.do by making it look like a team of 10-20 people.

Emre Elbeyoglu:
So, as far as I know, you were a solo founder for the first 2 years, right? You progressed a lot as a solo founder, and then you grew the team, right? We skipped that part when talking about the team. That's also important.

Emre Elbeyoglu:
It's not easy to create an A-to-Z solution from a website, doing everything. It means that at that moment in the market, what you called "lucky," the market situation at that time, meant that people were having a problem with this.

Batuhan Özyön:
Definitely.

Emre Elbeyoglu:
And your solution is kind of like, it's not quite right to compare, but just like OpenAI's APIs were used like crazy, then DeepSeek came out, which is open source and about 50 times cheaper, or 30 times. They even made the API the same. You just change "chat." On one hand, if the quality isn't there, you can change it. Can your solution be understood as easy for a developer to switch to, as long as it's problem-free?

Emre Elbeyoglu:
Because migration is a bit easier, switching from something else can also be easy. For example, LinkedIn is a platform that is scraped a lot. Things like job postings, contact information, etc.

Emre Elbeyoglu:
LinkedIn has recently limited this a bit more, closing down things like former companies worked at. It's slowly closing down a bit more. It's always trying to steer things towards the post-login side, meaning the part that can't be scraped, or can't be scraped legally. What do you think about this? I mean, with AI, I think this scraping will increase much more because agents are using it.

Emre Elbeyoglu:
Maybe it will go through APIs, but do you think the web will become more closed in the future? Or where is this heading with AI and agents?

Batuhan Özyön:
Let me start with LinkedIn. Regarding hiding public data, they are entirely commercial. They have valid reasons too, I think; they provide very well-segmented human data. But if we ask how much this can be prevented, with Scrape.do, we provide this publicly, but once it falls behind a login, we stop providing it. But this doesn't mean it can't be scraped.

Batuhan Özyön:
It starts to fall into the grey area. Meaning, individual people start doing this. They accept crypto payments, sell the dataset. Either the data they scrape, or companies or small firms emerge that offer this like a small SaaS, a micro-SaaS, in a closed Telegram group.

Emre Elbeyoglu:
Isn't it much easier to block things on a closed web? Because there's a user inside, and you know what they're doing. Otherwise, there are thousands of people with different fingerprints that you don't know, but getting thousands of people with different fingerprints to sign up can be limited, can be blocked. It's not as easy as public, I think, or it seems more blockable.
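The point that a logged-in user is easier to limit, because the site knows which account is doing what, can be illustrated with a per-account sliding-window limiter. This is a generic sketch, not any particular site's mechanism; the class name, thresholds, and account IDs are made up for the example.

```python
from collections import defaultdict, deque


class PageViewLimiter:
    """Allow at most max_views page views per account within window_seconds."""

    def __init__(self, max_views: int, window_seconds: float):
        self.max_views = max_views
        self.window_seconds = window_seconds
        self._views = defaultdict(deque)  # account_id -> timestamps of recent views

    def allow(self, account_id: str, now: float) -> bool:
        views = self._views[account_id]
        # Drop timestamps that have fallen out of the window.
        while views and now - views[0] >= self.window_seconds:
            views.popleft()
        if len(views) >= self.max_views:
            return False  # this account has hit its limit
        views.append(now)
        return True


if __name__ == "__main__":
    limiter = PageViewLimiter(max_views=2, window_seconds=60)
    for t in (0, 1, 2):
        print(t, limiter.allow("account-a", now=t))
```

Because the limit is keyed on the account rather than on an IP address or fingerprint, rotating proxies does not reset it.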

Batuhan Özyön:
What we see on TikTok, Instagram, these farms, computers, screens, I mean, there are friends who look like broadcasters with lots of mobile phones, let's say. Doing bot work. These are actually doing what we describe as grey-area work. Compared to what we do, it can't be blocked. It's unblockable.

Batuhan Özyön:
At the end of the day, there's a physical device, and even if it's taking human actions, this applies to computers too. It converts to machine code in the background. When it automates this, it only leaves browsing traces behind. Like it went to the homepage, went to the detail page.

Emre Elbeyoglu:
It can make membership difficult, etc. Even if it looks like a real user, say an email or SMS verification, all of these can still be overcome. But in the end, for example, even a user on LinkedIn has limits. We did a lot of that during our agency days. They can visit this many pages, etc., they can block that in the background. By the way, opening a parenthesis, we said there are these farms, they put lots of phones, even on walls, we see Instagrams, etc., and there are phones acting like real users.

Emre Elbeyoglu:
Actually, Scrape.do's proxy pool creates profiles with different fingerprints like this in the background without needing devices. But because they can't do it, they use this method. Correct? Because they might understand these two as different things. Actually, for those who can do it, it's the same thing.

Batuhan Özyön:
Yes, I mean, I completely agree with you. Since we are on the side that can do this, it's very easy for us to talk about it. But since we evaluate this with data that can be pulled publicly, our work just becomes a tool that companies that need data or are fed by data can use. I describe it like this. I always say this here. We sell lockpicks. If a locksmith uses it, it serves the homeowner, but if a thief uses it, it robs the homeowner.

Batuhan Özyön:
It's exactly like that.

Emre Elbeyoglu:
The same things apply. Okay, I had asked about scraping on the closed web. With AI, how do you think this scraping thing will continue? Because, especially since AI is dominant, I see a lot of startups that manipulate the browser to make it look like a real user, just using AI to, say, book on Airbnb. I see these are in vogue, getting a lot of investment and stuff. Where do you think this is going?

Batuhan Özyön:
Let me also give a figure: 17 million dollars, good investment.

Emre Elbeyoglu:
Which company was that?

Batuhan Özyön:
Was it Browser AI?

Emre Elbeyoglu:
Browser AI, the open-source one?

Batuhan Özyön:
It received 17 million dollars, I think. Yes, that one.

Emre Elbeyoglu:
It's a company in Silicon Valley, I'll write in the comments later which company it was. There are already many companies. How do you see it?

Batuhan Özyön:
I see it this way. For example, services like Browser AI that we just mentioned, no matter how much they focus on browser actions that can be done with AI, they are services doomed to be blocked by firewalls in the end. So, teams that don't have strong reverse engineering muscles will unfortunately not be able to achieve anything. They won't even be able to click through a simple captcha verification in those browsers. At least for now.

Batuhan Özyön:
I think they can solve it in the long run. But these are topics that require serious reverse engineering knowledge. These companies need to build their own browser systems. The open-source Chromium resources they use, etc., are not capable of solving these.

Batuhan Özyön:
Especially with things like MCP, for example, they show a test page, and some friend wrote, they order from Migros, and their orders arrive at their home. These are simple things because Migros is not a company against taking orders via bots, nor will it be, because autonomous systems will make more sales.

Emre Elbeyoglu:
It won't leave the door wide open at some point either, perhaps.

Batuhan Özyön:
At that point, the advantage of artificial intelligence could be this. In processes that require action, like, say, log in, go there, go to that page, do this, it seems to me that it will enable people without programming knowledge to develop more RPA-like applications. But tools that will bypass bot protection, that will display the page, are not among the currently developing artificial intelligence technologies.

Emre Elbeyoglu:
Maybe a protocol or something like allowlisting AI agent IP addresses will emerge. Could something like this be possible for you?

Batuhan Özyön:
They are already doing it. Currently, the biggest scraping is done by Google. Then OpenAI.

Emre Elbeyoglu:
They are allowed, but for example, as an AI agent developer, I go to a company and say, "Add me to your list. I won't do anything illegal. Just don't block me or my agents."

Batuhan Özyön:
They can say that. If the company wants, we can do it.

Emre Elbeyoglu:
This is a bit more difficult. What I just said didn't make much sense to me.

Emre Elbeyoglu:
Good. So, in the future, its use will increase with AI agents, but only those who can overcome these challenges will be able to do it. And you said this is not an easy thing.

Batuhan Özyön:
Not for now. There are two methods. Or rather, there are two different parallel lines of business here. One is where the website is freely accessible, like a demo example, and someone wants to write an automation for it. AI will do great things there.

Batuhan Özyön:
Our experiences there, our lives will change. I deeply believe this. But on the other hand, if we need to perform an action on a website that requires intervention, and the website owner, the company, doesn't want to allow that access, AI is helpless right now. On the contrary, AI is very talented at blocking. It can understand that the visitor is not human.

Emre Elbeyoglu:
Right now, people are already in a dilemma. When ChatGPT first came out, everyone was up in arms saying, "It used our data," etc. But the train has already left the station. It's too late now. Afterwards, they immediately blocked ChatGPT's bot.

Emre Elbeyoglu:
So it wouldn't enter. Then, when ChatGPT started sending traffic to them, some started to open up again. So, I think this will be like the process with Google because, in the end, trying to shut down your site is also damaging your own potential. And just like ChatGPT can now crawl a site with ease when you ask a question and it visits all of them, I think it will continue in the same way, but let's see what the future holds.
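One common way sites signal that kind of block is a robots.txt rule aimed at a specific crawler. The sketch below is only an illustration of that mechanism, not something described in the conversation: the rules are a made-up example, `is_allowed` is a hypothetical helper, and `GPTBot` is used as the crawler name on the assumption that the site targets OpenAI's crawler by its user agent. Robots.txt is also voluntary, so this shows what a well-behaved crawler would check, not an enforcement layer.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one AI crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "Googlebot"):
        print(agent, is_allowed(ROBOTS_TXT, agent, "https://example.com/page"))
```

Opening the site up again, as described above, would amount to deleting the first rule group.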

Batuhan Özyön:
Yes. It's not a topic I'm 100% sure about, but I wanted to share it: if I'm not mistaken, the request volume of OpenAI's bots has grown to about 5 times that of Google's search crawlers, and they all hit websites at the same time. Server CPUs are maxing out and databases can't keep up. Unfortunately, some companies are experiencing serious problems because of this.

Emre Elbeyoglu:
These still seem like solvable problems.

Batuhan Özyön:
Let them send traffic.

Emre Elbeyoglu:
I'm also very curious about the bot side, and about why this cannot be prevented. Before I met you, I always assumed they must be blocking it somehow, with some margin of error, of course, say 1% or 2%, since every company has some margin. But this is ultimately a billion-dollar business. For example, Google's fight against fraud is also very tough. They are also trying to block non-real users in the same way. So, if we look at this from a different perspective, the people who misuse this do what a scraper does, but in order to appear as a valid click in advertising. They are actually spending a crazy amount of money on this. How many thousands of engineers does Google have for this, for example?

Emre Elbeyoglu:
I always thought they must be allocating serious resources for this. But in the end, it was in those fraud reports. The last one I read said 30% was fraud. Totally insane. Does it mean that after a certain point, they accept 30% of the total market?

Emre Elbeyoglu:
Or how much resource do you think they allocate to this, and how much do they stop it? How much can they stop it, do you think?

Batuhan Özyön:
Technically, there's unfortunately a limit to how much they can stop it. I mean, at the end of the day, as I said before, different computers are actually communicating with each other through human hands. We, as humans, are not communicating. Humans are using computers there; computers are tools. And the traffic in between is entirely digital.

Batuhan Özyön:
Now, since we are sending our own digital copies to Google's servers, there is only one thing they can do when they evaluate this, and only after a while. Of course, I'm talking about the most professional kind of access, so let me describe it from the perspective of how we access sites. First, they start looking to see if the sessions have a trace.

Batuhan Özyön:
And these click reports, etc., usually come out a day later on Google, or with a delay of some hours. This is the reason. Does the profile of the person who clicked the ad have a history? Does it have a score in the browser, based on everything from the requests it sent to its cookies to the websites it visited? Google's Chromium project is fundamentally built around such things: understanding people better and segmenting them. So, when they make sense of all this and look at it, they do a scoring, and based on this scoring, they can actually understand whether it's a bot or not. If you ask if they can understand 100%?

Batuhan Özyön:
No. I don't think they can. But 30% is a lot. Now I'm starting to think about our own ad campaigns, because it's very possible in our sector.
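The scoring idea described above can be illustrated with a toy example. This is a minimal sketch under loud assumptions: the signal names, weights, caps, and threshold are all invented for illustration and say nothing about how Google or any other company actually scores traffic. It only shows the general shape of "collect history signals, combine them into a score, compare against a cutoff".

```python
from dataclasses import dataclass


@dataclass
class SessionProfile:
    """Hypothetical signals a detector might collect about a visitor."""
    cookie_age_days: float  # how long the oldest cookie has existed
    sites_visited: int      # distinct sites seen in the browsing history
    prior_sessions: int     # earlier sessions tied to the same profile


def trust_score(profile: SessionProfile) -> float:
    """Combine the signals into a score between 0.0 and 1.0.

    Each signal is capped at 1.0 and weighted; the weights are arbitrary.
    """
    cookie_part = min(profile.cookie_age_days / 30.0, 1.0) * 0.4
    history_part = min(profile.sites_visited / 50.0, 1.0) * 0.3
    session_part = min(profile.prior_sessions / 10.0, 1.0) * 0.3
    return cookie_part + history_part + session_part


def looks_like_bot(profile: SessionProfile, threshold: float = 0.3) -> bool:
    """Flag profiles whose history is too thin to be trusted."""
    return trust_score(profile) < threshold


if __name__ == "__main__":
    fresh = SessionProfile(cookie_age_days=0, sites_visited=0, prior_sessions=0)
    aged = SessionProfile(cookie_age_days=90, sites_visited=200, prior_sessions=25)
    print(looks_like_bot(fresh), looks_like_bot(aged))
```

A profile with no history lands below the cutoff, while an aged profile passes, which is also why a detector built this way can never be 100% certain: a genuinely new user looks the same as a fresh bot.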

Emre Elbeyoglu:
Right. Yeah, I remember things from the Flatart days too. Especially... you know, advertisers like carpet cleaners, dry cleaners, and so on.

Emre Elbeyoglu:
They're extremely reliant on Google now, in our age. Because people don't look for the neighborhood carpet cleaner anymore; if something spills on the carpet or there's an issue, they search and call directly. With those advertisers, for example, everyone was clicking on each other's ads to exhaust the other's budget. And real users are involved there, friends and family, etc., so Google can't do anything about it, and in the end, it somehow depletes the person's budget. So there's a side to this that really can't be prevented: there are those who benefit themselves through bots, but on the other hand, there are also malicious people. For example, I never click on an ad to go to a website I know.

Emre Elbeyoglu:
Even if I search for its own brand name. It feels rude to do that. Therefore, the fact that those numbers are 20-30% now seems a bit more normal to me, if you ask me. But Google also has an advantage: it sells some keywords for as much as $50 per click. In such a world, it seems Google accepts these losses as a margin of error. Like when you open a cafe and there's a 3-5% margin you can't account for, and you categorize it as a loss. Maybe Google has an accepted rate there.

Emre Elbeyoglu:
"This much of our team will work on solving this. But this is an accepted rate for us," they might have said, I think now, because there's very little they can really prevent. Should they check if it's a new user? Then, should we not count new users? That leads to another contradictory thing. But of course, this video shouldn't end up with a title like "How to bypass Googlebot."

Batuhan Özyön:
We sell SERP. It's available on SERP APIs. Let me not be misunderstood. Let me say it again here.

Emre Elbeyoglu:
Super, man, it was a really enjoyable chat. Is there anything you'd like to add apart from what we've talked about? For example, I know from our previous conversations that you have new products and services planned. I think you can briefly talk about them, and then we can slowly wrap up.

Batuhan Özyön:
Of course, with pleasure. Currently, we only provide an API service at Scrape.do. But with the resources we have, we are planning a direct proxy service and a browser network that can be connected to and managed via MCP, and that will be able to proceed without being blocked. And with these kinds of automation tools, AI developers, as we can call them, have become very common on an individual level. Everyone has become capable of writing code.

Batuhan Özyön:
Even if not 100%. In the end, it's very difficult for them to solve the challenges they face while scraping with a browser or otherwise. At this point, to cater to everyone, to everyone we can call a developer, we will start offering new solutions. We plan to provide directly scraped data soon. 4-5 different products are actually being developed in parallel right now.

Batuhan Özyön:
We plan to offer these first to the companies we work with. Because these will be somewhat high value-added products.

Emre Elbeyoglu:
And...

Batuhan Özyön:
We want to meet their demands first. Then we plan to open these up to the entire sector.

Emre Elbeyoglu:
I see. So, to name them, a solution where you sell ready-made datasets, a solution where you sell proxy services, and if I remember correctly, you will also provide a browser service. Meaning, a scraping browser, just like our friend who received investment, you will also develop a solution that allows any AI agent to be connected and perform an operation smoothly.

Batuhan Özyön:
And at the end of the day, we approach all of this like this. In case it interests the technical friends listening to this, since we even develop our own browser binary, we will have all the tools in-house to collect data without being blocked by a website. Of course, there's this situation: currently, for example, when we drive a browser via MCP to open a website, or when we actually scrape with AI, the costs are 10-20 times higher. It's impossible for the end user of the service to identify ways to optimize these on their own.

Batuhan Özyön:
We can manage these internally and want to offer them a structure that can provide a more suitable, stable, and cost-effective solution. I hope we will be successful; it's almost there. Hopefully.

Emre Elbeyoglu:
Well, that's good news. I think all of them are solutions that can be the fuel for artificial intelligence right now. Because what AI agents use... Everyone is making AI agents. Actually, we can say you are one of the people selling shovels in the gold rush era.

Emre Elbeyoglu:
So, we'll see if you'll be the one earning the most in the coming years. We'll see. Thank you again, man, for your time. I hope it was beneficial for the viewers. Likewise, I really want you to be known more.

Emre Elbeyoglu:
Because, as far as I know, there's no one doing this in Turkey. Developers might be doing it or challenging it. So, I definitely think it's beneficial for both sides, the audience and you. I hope it will be so.

Batuhan Özyön:
Hopefully. Thank you too. It was very enjoyable. By the way, this was my first such speaking experience. Let's break the ice.

The transcription provided by youtube-video-transcript.com
