January 3, 2023

Is Web Scraping legal? Compliance and privacy

Legal Web Scraping

Web scraping, its uses, its legality, and its limits are being debated around the world. This post explains the conditions under which scraping publicly accessible data is permitted and reviews the recent landmark ruling in the United States, along with other pronouncements, that came down in favor of web scraping.

What do we mean by web scraping? It is the automated extraction of large amounts of data from websites by software programs. Automation makes it possible to gather information faster, more safely, and with fewer errors than collecting it by hand. Its usefulness comes from combining the extracted data, often from several sources, analyzing it, and drawing conclusions. With web scraping, we add intelligence, differentiation, and value to our business strategy, marketing efforts, or the development of a new product.
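To make the idea of automated extraction concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a production scraper: the sample page, the `price` class name, and the `extract_prices` helper are all hypothetical, and the HTML is an inline string rather than a page fetched from a real site.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text inside every <span class="price"> element.

    The "price" class is an assumption made for this example only.
    """

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "span" and ("class", "price") in attrs:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a price span.
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def extract_prices(html):
    """Return the price strings found in an HTML document."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


# Hypothetical page content standing in for a publicly accessible site.
SAMPLE_PAGE = """
<ul>
  <li>Widget <span class="price">9.99</span></li>
  <li>Gadget <span class="price">24.50</span></li>
</ul>
"""

print(extract_prices(SAMPLE_PAGE))  # prints ['9.99', '24.50']
```

In practice the same extraction step would run over pages from several sources, and the combined results would feed the analysis described above.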

The use of web scraping by companies is debated in many parts of the world. Users still worry about leaving their data in apps. However, there is evidence that the correct use of information collected from websites and apps can help provide quality services and improve the user experience. Millions of companies around the world carry out web scraping. Specialists have even estimated that up to 45% of all online traffic is generated by bots rather than by people.

Whether web scraping is legal is still debated. The central condition that makes the practice permissible is that the data must be freely available to third parties on the web, and intellectual property rights must be respected (protected data cannot be republished elsewhere). Otherwise, you risk sending spam or accessing third-party data that you do not have consent to store or process.


Jurisprudence in the world favors Web Scraping

The landmark ruling by the US Court of Appeals for the Ninth Circuit is the latest step in LinkedIn’s long-running legal battle to prevent a rival company, hiQ Labs, from extracting personal information from users’ public profiles. The case reached the US Supreme Court in 2021, which sent it back to the Ninth Circuit for reconsideration.

In its second ruling, the Ninth Circuit reaffirmed its original decision, finding that extracting data that is publicly accessible on the Internet does not violate the Computer Fraud and Abuse Act (CFAA), the US federal law that governs what constitutes computer hacking.

The decision is a major victory for everyone who uses tools to scrape large volumes of publicly accessible data on the Internet.

As further background, the Australian Senate's Select Committee on Financial Technology and Regulatory Technology concluded that there is no evidence that services provided using web scraping pose an additional risk. In its Recommendation 22, it rejected the proposal from bank representatives who sought to ban advanced web scraping.

The issue is also debated in Europe. The PSD2 directive established the obligation to implement APIs, and banks asked that web scraping be prohibited, arguing that it had become unnecessary. After discussion, however, the conclusion was that it is better to keep it as an option.

In general terms, and beyond the current controversies, we can say that scraping publicly accessible data is legal.