Criminal Liability for Web Crawling: Complete Analysis of Supreme Court Case 2021Do1533







In the modern digital economy, as big data and machine learning technologies develop rapidly, the use of ‘web crawling’ techniques to automatically collect large volumes of information from the internet has grown explosively. From portal search to e-commerce price monitoring and financial investment analysis, services that would be unimaginable without crawling technology permeate daily life. Behind this technological innovation, however, lie complex legal questions, and there was a particular need for clear standards on when, and to what extent, crawling a competitor’s core data is legally permissible.

Against this backdrop, the Supreme Court issued a significant ruling (Supreme Court Decision 2021Do1533, decided May 12, 2022). Arising from a data-crawling dispute between accommodation-booking platforms, it is regarded as a landmark precedent that first clarified when web crawling can be criminally punished.

1. Case Background: Competitor Data Crawling Dispute

This case involves a criminal trial where employees of Company A (defendants), which operates accommodation information and reservation brokerage services, were indicted for allegedly obtaining data through automated methods without authorization from the mobile application (‘Direct Booking’) and website operated by Company B (victim company), a competitor in the same industry.

Key Points of the Indictment:

  • Information Acquisition Process: The defendants used packet analysis tools to extract, from the victim company’s app, the information needed to connect to its API server (module structure, URL addresses, command syntax, etc.).
  • Circumventive Access: Based on the decoded information, they disguised themselves as regular users and directly accessed the victim company’s API server from PC environments.
  • Crawling System Development and Operation: They developed and operated a crawling system that batch-queried all accommodation facility information within a 1,000 km radius of the defendant company’s location, whereas the app normally provides information only within 7–30 km of the user’s location.
  • Mass Data Collection: Over approximately four months (June 1 – October 3, 2016), they ran the crawling system once or twice daily, accessing the API server and copying, without authorization, information such as partner accommodation names, locations, and room names.

Applied Legal Charges:

  • Violation of the Act on Promotion of Information and Communications Network Utilization and Information Protection (unauthorized intrusion into information and communication networks)
  • Copyright Act violation (infringement of database producer rights)
  • Violation of the Criminal Code (obstruction of business by computer)

Stark Contrast Between First and Second Instance Judgments:

First Instance Court (Guilty Verdict):

  • (Regarding information network intrusion) The court found intrusion, citing the victim company’s private management of its API information, the explicit prohibition of automatic connection programs in the terms of service, and the defendants’ continued access through changed IP addresses despite IP-blocking measures.
  • (Regarding copyright violation) The court found that the defendants had systematically and without authorization replicated the database 264 times over roughly six months.
  • (Regarding obstruction of business) The court held that requesting nationwide information beyond the app’s normal search range generated mass calls and obstructed the victim company’s business.

Appeals Court (Not Guilty Verdict):

  • (Regarding information network intrusion) The court acquitted, reasoning that API access required no membership registration or password; that packet analysis is a common technique; that no technical measures hid the API server URLs or blocked access; that the collected information was public, and merely expanding the search range did not exceed any access authority; and that the terms of service applied only to members.
  • (Regarding copyright violation) The court found that the collected items (3 to 8 of roughly 50 total) did not constitute a ‘substantial part’ of the database, and that the content was mostly public information obtainable through the app.
  • (Regarding obstruction of business) The court reasoned that an API server is designed to return information in response to commands, so requests within the permitted command syntax are not ‘false information or improper commands,’ and that the dates of the server failures could equally be explained by ordinary spikes in usage.

With first and second instance judgments being diametrically opposed, industry attention focused on the Supreme Court’s final judgment on criminal liability for web crawling.

2. Web Crawling Technology and Related Legal Issues

To understand the core of the Supreme Court ruling, it’s necessary first to accurately grasp the essence of web crawling technology and the legal issues surrounding it.

Technical Definition of Web Crawling:

Web crawling is a process of systematically exploring and collecting webpage data on the World Wide Web through automated software called ‘crawlers.’ Starting from designated web addresses (URLs), it sequentially tracks hyperlinks within pages while downloading and storing data in a chain reaction.

  • Distinction from Web Scraping: While scraping focuses on extracting and processing specific data from webpage screens, crawling emphasizes collecting webpages themselves and gathering extensive data by following links. However, in practice, they are often used interchangeably for data collection purposes.
  • Clear Distinction from Hacking: Crawling basically involves accessing ‘public’ servers on the internet to collect information, so it’s fundamentally different from hacking activities that illegally penetrate systems or alter/destroy data. (This Supreme Court ruling is significant as the first case to legally confirm that crawling does not fall under the ‘hacking’ category of information network intrusion.)
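The ‘chain reaction’ of following links described above can be sketched in a few lines of Python. This is an illustrative breadth-first crawler only; the page-fetching function is passed in as a parameter so the sketch stays self-contained, and the regex link extraction is a deliberate simplification (a real crawler would use an HTML parser and an HTTP library).

```python
from collections import deque
import re

def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl: fetch a page, extract its links, follow them.

    `fetch(url)` must return the page's HTML as a string.
    Returns a dict mapping each visited URL to its HTML.
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages[url] = html
        # Naive href extraction; real crawlers use an HTML parser.
        for link in re.findall(r'href="([^"]+)"', html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Injecting `fetch` also makes the crawler easy to exercise against an in-memory ‘site’ without touching the network.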

Practical Applications of Web Crawling:

  • Search Engines: Google, Naver, etc., collect worldwide web information through crawlers and provide search services by indexing.
  • Price Comparison: Crawling product information from multiple shopping malls for price comparison and analysis.
  • Data Analysis: Crawling news articles, SNS posts, etc., to analyze social trends and public opinion.
  • Market Research and Competitive Analysis: Collecting competitors’ public prices, service information, etc., for business strategy development.

Legal Regulations and Dispute Patterns Related to Web Crawling:

Most websites allow basic crawling for search engine exposure. However, disputes arise when competitors crawl large amounts of data without authorization for commercial purposes.

  • Technical Prevention Measures (robots.txt): Website operators can allow or restrict specific crawlers through a `robots.txt` file. However, robots.txt is only a non-binding convention, and not all crawlers comply with it.
  • Evolution of Legal Issues: Early disputes centered on personal information infringement, but as data became a core business asset, competition-law questions (whether denying data access restricts market competition, or whether crawling itself is an unfair means of competition) moved to the fore.
  • Criminal Law Interest: Discussion of criminal punishment possibilities was relatively limited. This Supreme Court ruling is notable for directly addressing this criminal liability issue of crawling activities.
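To make the `robots.txt` convention above concrete, here is a minimal check using Python’s standard `urllib.robotparser`; the file contents and bot names are invented for the example.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: shut out "badbot" entirely,
# and keep all crawlers out of /private/.
robots_txt = """\
User-agent: badbot
Disallow: /

User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("goodbot", "https://example.com/rooms"))      # allowed
print(rp.can_fetch("goodbot", "https://example.com/private/x"))  # disallowed
print(rp.can_fetch("badbot", "https://example.com/rooms"))       # disallowed
```

As the article notes, nothing technically forces a crawler to call `can_fetch` before requesting a page; compliance is voluntary.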

3. Supreme Court Ruling: Criminal Liability Standards for Web Crawling

In the ruling at issue (Supreme Court 2021Do1533), the Supreme Court upheld the appellate court’s acquittal on all three of the aforementioned charges. Its specific reasoning for each charge is as follows.

Information Network Intrusion Crime (Information Network Act Violation): “Access Authority Must Be Judged Objectively”

Core Issue: Did the defendants have ‘legitimate authority’ to access the victim company’s API server?

Supreme Court’s Judgment Standard: The presence or absence of ‘access authority’ in information network intrusion crimes should not be determined by the service provider’s subjective intent, but judged carefully and comprehensively from ‘objectively revealed circumstances’: ① whether technical protective measures were in place to prevent access, and ② whether the terms of service specified permitted access methods or ranges. The Court thereby established a new legal standard.

Application to the Case:

  • The victim company’s API server URLs could be easily identified through packet analysis, and there were no separate authentication procedures or technical protective measures blocking access.
  • While the terms of service prohibited automatic connection programs, those terms were interpreted as binding members only, so they were difficult to apply to the non-member defendants; moreover, their wording could not readily be read as prohibiting access to the API server itself.
  • The victim company’s IP blocking was merely a technical measure due to mass calls, and it was difficult to see this as objectively expressing intent to prohibit all access through other IPs besides the blocked IP.

Database Producer Rights Infringement (Copyright Act Violation): “Not Replication of ‘Substantial Part’ of Database”

Core Issue: Did the information replicated by defendants constitute the ‘whole or substantial part’ of the victim company’s database?

Supreme Court’s Judgment Standard: Whether it constitutes a ‘substantial part’ should comprehensively consider both the ‘quantitative’ aspect compared to the entire database scale and the ‘qualitative’ aspect of the importance that part holds in the investment or effort for database construction.

Application to the Case:

  • Information collected by defendants was only 3-8 items out of about 50 total items, making it difficult to consider ‘quantitatively’ substantial.
  • The collected information (business names, addresses, prices, etc.) was mostly information disclosed to users by the victim company for business purposes or easily knowable through normal app use, so ‘qualitative’ importance was also judged to be low.

Computer Obstruction of Business Crime (Criminal Code Violation): “Not Input of ‘Improper Commands’”

Core Issue: Did the defendants’ input of extensive search commands beyond the app’s normal range to the API server constitute ‘improper command’ input, and did this cause server failures that obstructed business?

Supreme Court’s Judgment Standard: ‘Improper commands’ mean commands that contradict the normal purpose and method of use that the system anticipates. This should also be judged based on objectively revealed system permission ranges rather than administrator’s subjective intent.

Application to the Case:

  • The victim company’s API server was basically designed to return information according to given command syntax and set no explicit restrictions on search radius, etc.
  • Therefore, the defendants’ requests for information by setting broad search ranges within the command syntax format allowed by the API server could not be viewed as ‘improper commands’ contrary to system purposes.

4. Legal Significance and Practical Implications of the Ruling

This Supreme Court ruling presented important legal standards for criminal liability of web crawling.

Major Legal Significance:

  1. Emphasis on ‘Objective Circumstances’: In determining whether information network intrusion or obstruction of business is established, questions of access authority and the impropriety of commands must be judged from technical protective measures, explicit terms of service, and other objectively revealed circumstances, not from the service provider’s subjective intent.
  2. Emphasizing the Importance of Technical Protective Measures: Website or API server operators who want to restrict data access must now put substantial technical access-control measures in place, beyond merely stating prohibitions in their terms of service.
  3. Specifying Database ‘Substantiality’ Judgment Standards: It reconfirmed that the meaning of ‘substantial part’ when judging database rights infringement under copyright law should be comprehensively considered from quantitative and qualitative aspects.
  4. No Blanket Immunity for All Crawling: The ruling by no means immunizes every form of web crawling. Where the circumstances differ, such as bypassing strong technical protective measures or taking non-public information or the core, substantial parts of a database without authorization, criminal liability remains possible.

Practical Impact and Guidelines:

From Data Provider Perspective:

  • To deter crawling, implement multi-layered technical protective measures such as robots.txt settings, IP-based access restrictions, CAPTCHA systems, and API key authentication.
  • Specify crawling-prohibition clauses in the terms of service, state clearly that they apply to non-members as well, and spell out concrete sanctions for violations.
  • Make important data accessible only after membership registration and login, establishing a clear access-authority system.
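As an illustration of how the listed measures might combine on the server side, here is a hypothetical gatekeeper that rejects requests lacking a valid API key and applies a fixed-window per-IP rate limit. The key values and limits are invented; this is a sketch, not any platform’s actual logic.

```python
import time
from collections import defaultdict

VALID_KEYS = {"demo-key-123"}   # hypothetical issued API keys
MAX_REQUESTS_PER_WINDOW = 5     # per IP address
WINDOW_SECONDS = 60

_windows = defaultdict(lambda: [0.0, 0])  # ip -> [window_start, count]

def check_request(ip, api_key, now=None):
    """Return (allowed, reason) for one incoming request."""
    now = time.time() if now is None else now
    if api_key not in VALID_KEYS:
        return False, "invalid or missing API key"
    start, count = _windows[ip]
    if now - start >= WINDOW_SECONDS:
        # Window expired: start a fresh one for this IP.
        _windows[ip] = [now, 1]
        return True, "ok"
    if count >= MAX_REQUESTS_PER_WINDOW:
        return False, "rate limit exceeded"
    _windows[ip][1] = count + 1
    return True, "ok"
```

The point of such measures, in the ruling’s terms, is that they objectively reveal the operator’s intent to restrict access, rather than leaving it to subjective intent or terms-of-service wording alone.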

From Crawling Performer Perspective:

  • Check and comply with target sites’ robots.txt files, and adjust request intervals to avoid overloading servers with excessive requests.
  • Carefully review terms of service, and if crawling prohibition clauses exist, carefully assess legal risks.
  • Avoid collecting personal information or core trade secrets, focusing on publicly available information.
  • Replicating all or a substantial part of a database carries copyright-violation risk, so collect only the minimum information necessary.
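The first two guidelines (honor robots.txt and pace your requests) can be folded into a small ‘polite fetcher.’ This is a sketch under the assumption that pages are fetched with the standard `urllib`; the user-agent name and delay value are arbitrary choices.

```python
import time
import urllib.request
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

class PoliteFetcher:
    """Fetch URLs only if robots.txt permits, waiting between requests."""

    def __init__(self, user_agent="example-bot", delay_seconds=1.0):
        self.user_agent = user_agent
        self.delay = delay_seconds
        self._robots = {}        # host -> cached RobotFileParser
        self._last_request = 0.0

    def _robots_for(self, url):
        host = urlparse(url).netloc
        if host not in self._robots:
            rp = RobotFileParser(f"https://{host}/robots.txt")
            rp.read()            # downloads and parses robots.txt
            self._robots[host] = rp
        return self._robots[host]

    def fetch(self, url):
        if not self._robots_for(url).can_fetch(self.user_agent, url):
            raise PermissionError(f"robots.txt disallows {url}")
        # Simple pacing: never fire requests closer than self.delay apart.
        wait = self.delay - (time.time() - self._last_request)
        if wait > 0:
            time.sleep(wait)
        self._last_request = time.time()
        req = urllib.request.Request(url, headers={"User-Agent": self.user_agent})
        with urllib.request.urlopen(req) as resp:
            return resp.read()
```

Caching one parser per host means each site’s robots.txt is downloaded only once per run.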

5. Conclusion

Supreme Court ruling 2021Do1533 presented an important milestone for the legally permissible scope of web crawling, which has become essential technology in the data economy era, particularly regarding criminal liability issues. It can be evaluated as requiring service providers to take clearer and more objective measures for data protection, while setting boundaries that companies utilizing crawling technology should observe to avoid improperly infringing others’ information network stability or database rights.

Legal discussions surrounding web crawling will continue alongside technological development. This ruling underscores the growing importance of balancing free data use and innovation against the protection of data subjects’ rights and service stability.

Professional legal services are needed to help companies pursue innovation while effectively managing legal risks in response to changing legal environments alongside automation technology development.


About the Author

Taejin Kim | Managing Partner, K&P Law Firm
Attorney specializing in Corporate Advisory, Corporate Disputes, Corporate Criminal Law
Former Prosecutor | 33rd Class of Judicial Research and Training Institute
Korea University LL.B, LL.M. in Criminal Law, University of California, Davis LL.M.
