Criminal Liability for Web Crawling: Complete Analysis of Supreme Court Case 2021Do1533


In the modern digital economy, as big data and machine learning technologies develop rapidly, the use of ‘web crawling’ techniques for automatically collecting large volumes of information from the internet has grown explosively. From portal search to e-commerce price monitoring and financial investment analysis, services that would be unimaginable without crawling technology surround our daily lives. Behind these innovations, however, lie complex legal issues. In particular, clear standards have been needed on when, and to what extent, crawling a competitor’s core data is legally permissible.

Against this backdrop, the Supreme Court issued a significant ruling (Supreme Court Decision 2021Do1533, decided May 12, 2022). It is regarded as a landmark precedent: the first to clarify whether criminal punishment is possible in a data crawling dispute between accommodation platforms.

1. Case Background: Competitor Data Crawling Dispute

This case is a criminal trial in which employees of Company A (the defendants), which operates accommodation information and reservation mediation services, were prosecuted for allegedly using automated methods to obtain data from a mobile application (‘Direct Reservation’) and website operated by Company B (the victim company), a competitor in the same industry.

Key Points of the Indictment:

  • Information Acquisition Process: The defendants utilized packet analysis tools to decode the victim company’s app program source code and API server access information (module structure, URL addresses, command syntax, etc.).
  • Circumventive Access: Based on the decoded information, they disguised themselves as regular users and directly accessed the victim company’s API server from PC environments.
  • Crawling System Development and Operation: They developed and used a crawling system that could retrieve, in bulk, information about all accommodation facilities within a specified area (a 1000km radius from the defendant company’s center). By contrast, the app typically provides information only within 7-30km of the user’s location.
  • Mass Data Collection: Over approximately 4 months (June 1 – October 3, 2016), they operated the crawling system 1-2 times daily to access API servers and unlawfully copy information such as business names, locations, and room names of affiliated accommodation businesses.

Applied Legal Charges:

  • Violation of the Act on Promotion of Information and Communications Network Utilization and Information Protection (Unauthorized computer network invasion, etc.)
  • Copyright Act violation (Infringement of database producer’s rights)
  • Criminal Code violation of obstruction of business using computers, etc.

Stark Contrast Between First and Second Instance Rulings:

First Instance Court (Guilty verdict):

  • (Regarding computer network invasion) The court determined it was an invasion act based on comprehensive consideration of factors including: the victim company managing API information as confidential, explicitly prohibiting automatic connection program use in terms of service, and defendants continuing to access by changing IPs despite IP blocking measures.
  • (Regarding copyright law violation) The court recognized systematic and organizational unauthorized copying of databases over 6 months across 264 instances.
  • (Regarding obstruction of business) The court determined that requesting nationwide information exceeding the app’s normal search scope caused mass calls that obstructed business operations.

Appeals Court (Not guilty verdict):

  • (Regarding computer network invasion) The court ruled not guilty based on factors including: API access requiring no membership registration or passwords, packet analysis being common technology, absence of technical measures to hide API server URLs or block access, collected information being public information where search scope expansion alone cannot be considered exceeding access authority, and terms of service applying only to members.
  • (Regarding copyright law violation) The court determined that collected information items (3-8) did not constitute a ‘substantial part’ compared to total items (around 50), and the content was mostly public information obtainable through normal app use.
  • (Regarding obstruction of business) The court viewed that since API servers are designed to return information according to commands, requesting information within permitted command syntax ranges is not ‘inputting false information or improper commands,’ and server failure dates could coincide with normal usage increase periods.

With first and second instance rulings being diametrically opposed, industry attention focused on the Supreme Court’s final judgment regarding criminal liability for web crawling.

2. Web Crawling Technology and Related Legal Issues

To understand the core of the Supreme Court ruling, it’s necessary to first accurately grasp the essence of web crawling technology and the legal issues surrounding it.

Technical Definition of Web Crawling:

Web crawling is a process of systematically exploring and collecting web page data on the World Wide Web through automated software called ‘Crawlers.’ Starting from designated web addresses (URLs), it sequentially traces hyperlinks within pages while downloading and storing data in a chain reaction.

  • Distinction from Web Scraping: While scraping focuses on extracting and processing specific data from web page displays, crawling emphasizes collecting web pages themselves and moving through links to gather extensive data. However, in practice, they are often used interchangeably for data collection purposes.
  • Clear Distinction from Hacking: Crawling fundamentally involves accessing ‘public’ servers on the internet to collect information, which makes it essentially different from hacking that illegally penetrates systems or manipulates or destroys data. (This Supreme Court ruling is significant as the first to confirm that crawling of the kind at issue does not amount to computer network invasion, the offense commonly associated with ‘hacking.’)
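
The link-following behavior described above can be illustrated with a minimal sketch. It is not taken from the case record: the three-page site, the `crawl` function, and the `LinkExtractor` class are hypothetical names introduced here, and the pages are held in memory so that the example makes no network requests.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl: visit start_url, then each page reachable by links."""
    visited = []
    seen = {start_url}
    queue = deque([start_url])
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page missing or unreachable
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return visited


# Hypothetical three-page site, kept in memory for illustration only.
SITE = {
    "https://example.com/": '<a href="/a">A</a> <a href="/b">B</a>',
    "https://example.com/a": '<a href="/b">B</a>',
    "https://example.com/b": '<a href="/">home</a>',
}

pages = crawl("https://example.com/", SITE.get)
print(pages)
```

A real crawler would replace `SITE.get` with an HTTP fetch; the `seen` set is what keeps the crawler from revisiting pages that link back to each other.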

Practical Applications of Web Crawling:

  • Search Engines: Google, Naver, etc., collect and index global web information through crawlers to provide search services.
  • Price Comparison: Crawling product information from multiple shopping malls to provide comparative price analysis.
  • Data Analysis: Crawling news articles, SNS posts, etc., to analyze social trends and public opinion.
  • Market Research and Competitive Analysis: Collecting competitors’ public pricing and service information for business strategy development.

Legal Regulation and Dispute Patterns Related to Web Crawling:

Most websites allow basic crawling for search engine exposure. However, disputes arise when competitors crawl large amounts of data without authorization for commercial purposes.

  • Technical Prevention Measures (robots.txt): Website operators can allow or restrict access by specific crawlers through a `robots.txt` file. However, this file is only a non-binding recommendation, and not all crawlers comply with it.
  • Evolution of Legal Issues: Early discussion focused on privacy invasion. As data became a core asset, discussion from a competition law perspective became active: whether denying data access restricts market competition, and whether crawling itself constitutes a means of unfair competition.
  • Criminal Law Interest: By comparison, discussion of possible criminal punishment was relatively limited. This Supreme Court ruling draws attention because it directly addresses criminal liability for crawling.
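
As a rough illustration of how a compliant crawler might consult `robots.txt`, the sketch below uses Python's standard `urllib.robotparser` module. The rules in `ROBOTS_TXT`, the `example-crawler` user agent, and the URLs are assumptions made for the example and do not come from the case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download this file
# from the target site before requesting anything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /api/
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url, user_agent="example-crawler"):
    """Return True if the parsed rules permit this user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_allowed("https://example.com/rooms"))       # path not disallowed
print(is_allowed("https://example.com/api/hotels"))  # path under /api/
print(parser.crawl_delay("example-crawler"))         # requested delay in seconds
```

Nothing in this check is enforced by the server. The crawler decides for itself whether to honor the answer, which is why the file works only as a recommendation.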

3. Supreme Court Ruling: Standards for Web Crawling Criminal Liability

The ruling under discussion (Supreme Court 2021Do1533) held that the appeals court was correct to find the defendants not guilty on all three charges described above. The Supreme Court’s reasoning on each charge is as follows:

Computer Network Invasion Crime (Information Network Act Violation): “Access Authority Must Be Judged Objectively”

Core Issue: Did the defendants have ‘legitimate authority’ to access the victim company’s API server?

Supreme Court’s Judgment Standard: The Court presented a new legal principle: the presence or absence of ‘access authority’ in computer network invasion crimes should not be determined by the service provider’s subjective intent. Instead, it should be judged carefully by comprehensively considering ‘objectively revealed circumstances,’ such as ① whether technical protection measures were in place to prevent access and ② whether access methods or the permitted scope were specified in the terms of service.

Application to the Case:

  • The victim company’s API server URLs could be easily identified through packet analysis, and there were no separate authentication procedures or technical protection measures blocking access.
  • While the terms of service prohibited automatic connection programs, these provisions were interpreted as applying to members, so it was difficult to apply them to the non-member defendants. The provisions were also difficult to read as prohibiting access to the API server itself.
  • The victim company’s IP blocking was merely a technical measure due to mass calls, and could not be seen as objectively expressing intent to prohibit all access through other IPs besides the blocked IP.

Database Producer Rights Infringement (Copyright Act Violation): “Not Copying a ‘Substantial Part’ of the Database”

Core Issue: Did the information copied by defendants constitute ‘all or a substantial part’ of the victim company’s database?

Supreme Court’s Judgment Standard: Whether it constitutes a ‘substantial part’ must comprehensively consider both ‘quantitative’ aspects compared to the overall database scale and ‘qualitative’ aspects regarding the importance of that part in the investment and effort for database construction.

Application to the Case:

  • The information collected by defendants was only 3-8 items out of approximately 50 total items, making it difficult to consider ‘quantitatively’ substantial.
  • The collected information (business names, addresses, prices, etc.) was mostly information disclosed to users by the victim company for business purposes or easily obtainable through normal app use, so ‘qualitative’ importance was also judged to be low.
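
To put the quantitative comparison in concrete terms, the figures above (3-8 items collected out of approximately 50) can be expressed as a share of item types. This is illustrative arithmetic only: the ruling as summarized here does not state percentages, and `share_percent` is a hypothetical helper name.

```python
TOTAL_ITEMS = 50  # approximate number of item types, as summarized above


def share_percent(collected, total=TOTAL_ITEMS):
    """Percentage of the database's item types that were collected."""
    return 100 * collected / total


low, high = share_percent(3), share_percent(8)
print(f"{low:.0f}% to {high:.0f}% of item types")  # prints "6% to 16% of item types"
```

A low share does not settle the matter on its own, since the standard also weighs the qualitative importance of what was taken.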

Computer Obstruction of Business Crime (Criminal Code Violation): “Not Input of ‘Improper Commands’”

Core Issue: Did the defendants’ input of extensive search commands beyond the app’s normal scope to API servers constitute ‘improper command’ input, causing server failures that obstructed business?

Supreme Court’s Judgment Standard: ‘Improper commands’ mean commands that go against the normal usage purposes and methods anticipated by the system. This should also be judged based on objectively revealed system permission ranges rather than administrator’s subjective intent.

Application to the Case:

  • The victim company’s API server was fundamentally designed to return information according to given command syntax, with no explicit restrictions on search radius, etc.
  • Therefore, defendants requesting information by setting broad search scopes within the command syntax formats allowed by the API server itself could not be viewed as ‘improper commands’ contrary to the system’s purpose.

4. Legal Significance and Practical Impact of the Ruling

This Supreme Court ruling presented important legal standards for criminal liability in web crawling.

Major Legal Significance:

  1. Emphasis on ‘Objective Circumstances’: When determining the establishment of computer network invasion or obstruction of business crimes, it clarified that access authority presence or command impropriety should be judged based on technical protection measures, explicit terms of service, and other objectively revealed circumstances rather than service providers’ subjective intentions.
  2. Emphasis on Importance of Technical Protection Measures: For website or API server operators that want to restrict data access, it has become important to implement substantial technical access controls rather than simply stating restrictions in the terms of service.
  3. Specification of Database ‘Substantiality’ Judgment Standards: It reconfirmed that the meaning of ‘substantial part’ in copyright law database rights infringement judgments must be comprehensively considered from quantitative and qualitative aspects.
  4. Not a Blanket Exemption for All Crawling: This ruling by no means gave blanket exemption to all types of web crawling. If circumstances differ, such as circumventing strong technical protection measures or unlawfully taking non-public information or core and substantial parts of databases, the possibility of criminal liability recognition remains open.

Practical Impact and Guidelines:

From Data Providers’ Perspective:

  • To prevent crawling, implementation of multi-layered technical protection measures such as robots.txt file settings, IP-based access restrictions, CAPTCHA systems, and API key authentication is necessary.
  • When specifying crawling prohibition clauses in terms of service, clearly state applicability to non-members and specify concrete sanctions for violations.
  • Important data should be accessible only after membership registration and login, requiring clear establishment of access authority systems.
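
One way to make access limits objectively visible, in the sense the ruling emphasizes, is to enforce them on the server instead of relying on the app's user interface. The sketch below is a hypothetical illustration and does not describe the victim company's system: `VALID_API_KEYS`, `MAX_RADIUS_KM`, `SearchRequest`, and `check_request` are all assumed names, and the 30 km ceiling simply mirrors the app's 7-30km range mentioned earlier.

```python
from dataclasses import dataclass
from typing import Optional

MAX_RADIUS_KM = 30.0           # assumed ceiling, mirroring the app's 7-30 km range
VALID_API_KEYS = {"demo-key"}  # hypothetical set of issued keys


@dataclass
class SearchRequest:
    api_key: Optional[str]
    radius_km: float


def check_request(req):
    """Return (status_code, reason) for an incoming search request."""
    if req.api_key not in VALID_API_KEYS:
        return 401, "missing or unknown API key"
    if req.radius_km <= 0 or req.radius_km > MAX_RADIUS_KM:
        return 400, "radius outside permitted range"
    return 200, "ok"


print(check_request(SearchRequest(api_key=None, radius_km=10)))
print(check_request(SearchRequest(api_key="demo-key", radius_km=1000)))
print(check_request(SearchRequest(api_key="demo-key", radius_km=10)))
```

With checks like these in place, a request without a key or with a 1000 km radius is rejected by the system itself, so the permitted scope no longer depends on what the operator privately intended.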

From Crawling Performers’ Perspective:

  • Check and comply with target sites’ robots.txt files, and avoid excessive requests that burden servers by controlling request intervals.
  • Carefully review terms of service, and if crawling prohibition clauses exist, carefully assess legal risks.
  • Avoid collecting personal information or core trade secrets, focusing on publicly available information collection.
  • Since copying all or substantial parts of databases carries copyright law violation risks, collect only necessary minimum information.
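
Controlling request intervals, as recommended above, can be sketched with a small throttle that enforces a minimum gap between requests. `RequestThrottle` is a hypothetical helper written for this article, and the two-second interval is an arbitrary example value, not a legal threshold.

```python
import time


class RequestThrottle:
    """Enforces a minimum interval between consecutive requests."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None    # time of the previous request, if any

    def wait(self):
        """Pause until the interval has elapsed; return the seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval_s - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


throttle = RequestThrottle(min_interval_s=2.0)
# Typical use: call throttle.wait() immediately before each request,
# so consecutive requests are at least two seconds apart.
```

When the target site publishes a crawl delay in its robots.txt file, that value is a natural choice for `min_interval_s`.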

5. Conclusion

Supreme Court Case 2021Do1533 marks an important milestone in defining the legally permissible scope of web crawling, an essential technology of the data economy era, particularly with respect to criminal liability. The ruling can be read as requiring service providers to take clearer and more objective measures to protect their data, while reminding companies that use crawling technology not to unduly infringe on the stability of others’ information networks or on their database rights.

Legal discussions surrounding web crawling will continue alongside technological advancement. Building on this ruling, efforts to balance free use of data and the promotion of innovation against the protection of information subjects’ rights and service stability will become increasingly important.

K&P Law Firm has experience successfully defending clients in recent inter-company automated program disputes and possesses expertise in legal risk analysis and response strategy development regarding IT companies’ data collection and utilization. If you are concerned about legal issues related to web crawling, we are ready to provide consultation anytime.

About the Author

Taejin Kim | Managing Partner, K&P Law Firm
Attorney specializing in Corporate Advisory, Corporate Disputes, Corporate Criminal Law
Former Prosecutor | 33rd Class of Judicial Research and Training Institute
Korea University LL.B, LL.M. in Criminal Law, University of California, Davis LL.M.
