A game-changing event happened in November 2022: OpenAI, an American AI (artificial intelligence) research company, launched its ChatGPT service, a general-purpose chatbot that can generate text.
The rapid development in AI content generation following this event has caused many people and institutions to worry about AI content generators’ adverse impact on various sectors and industries.
For instance, business owners today may be anxious to know whether they are getting their money’s worth on high-quality online content or wasting their hard-earned money on fake content.
Fortunately, AI content detection tools are advancing alongside AI text generators.
However, if you work in the content creation industry, you would want to know if these AI content detectors actually live up to their hype.
Moreover, you may wonder if these tools can protect you from unethical content creators who submit AI-generated content as their own (we’ve caught out our fair share). As a search agency heavily focused on editorial SEO with a view to long-term niche domination, we rely heavily on all our internal and contract writers submitting human-researched and human-written content (thereby operating well within the guidelines of Google’s TOS).
Our own investigation indicates that, while AI detection technology is relatively new and each tool varies in accuracy, the detectors (when combined) were fairly good at differentiating between human-written and AI-generated texts, with a low percentage of false positives (if you follow certain rules).
We test the articles submitted by our internal writers (control) and our contract writers using six tools: AI Classifier, Originality.ai, Copyleaks, Content at Scale, Corrector App, and Crossplag.
We used the data from this table to compute the percentages of AI content in the contract external writers’ submitted articles.
As of February 2023, our investigation of the articles submitted by the external agency produced the following real-world results:
We constantly update our dataset to give real-time information and improved evaluations.
The overall average AI% score (using the highest AI% scores across all tools) is 18.05%.
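As a rough illustration of how a figure like this can be computed, the sketch below takes the highest AI% score any tool gave each article and then averages those values. The three articles and their scores are illustrative only, not our full dataset, and the function names are hypothetical, so the result will not match the 18.05% above.

```python
from statistics import mean

# Illustrative AI% scores per article, one value per detector (not the full dataset).
scores = {
    "Article A": {"Originality.ai": 5.0, "Copyleaks": 1.1, "Crossplag": 1.0},
    "Article B": {"Originality.ai": 29.0, "Copyleaks": 20.2, "Crossplag": 1.0},
    "Article C": {"Originality.ai": 0.0, "Copyleaks": 9.0, "Crossplag": 1.0},
}


def highest_score(tool_scores):
    """Return the highest AI% score any detector gave one article."""
    return max(tool_scores.values())


def overall_average(all_scores):
    """Average the per-article highest AI% scores."""
    return mean([highest_score(s) for s in all_scores.values()])


print(round(overall_average(scores), 2))
```

Using the highest score per article is a deliberately cautious choice: a single tool flagging an article is enough to pull the average up.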
As of February 2023, Originality.ai has the highest accuracy and optimal features of the six AI detection tools we investigated.
AI detection is the process of detecting AI-generated content.
Free and paid AI detection tools can help business owners, journalists, and marketing agencies determine whether a piece of content is human-written (original) or AI-generated.
AI content detector tools or AI detectors use various methods to identify machine-generated content. For example, they can do the following:
Other AI detectors are classifiers trained on a dataset of human-written and AI-generated texts that has been divided into training, validation, and test splits.
The GPT-2 output detector uses this method on open-source data from OpenAI to examine the likelihood that OpenAI ChatGPT generated the text.
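To make the idea of a train-test-validation split concrete, here is a minimal sketch in Python. It only shows the splitting step, not the training of a detector. The sample data, the 80/10/10 proportions, and the function name `split_dataset` are assumptions chosen for illustration and are not taken from any specific tool.

```python
import random


def split_dataset(samples, train=0.8, validation=0.1, seed=0):
    """Shuffle labeled samples and split them into train, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],  # whatever remains becomes the test set
    )


# Hypothetical labeled texts: (text, label), where label is "human" or "ai".
samples = [(f"text {i}", "ai" if i % 2 else "human") for i in range(100)]
train_set, val_set, test_set = split_dataset(samples)
print(len(train_set), len(val_set), len(test_set))
```

A detector would learn from the training set, be tuned on the validation set, and have its accuracy reported on the test set, which it never saw during training.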
AI-generated text is a type of content that is digitally produced using machine learning (ML) and natural language processing (NLP).
ML is a branch of AI and computer science that enables systems to simulate human intelligence, allowing machines to predict outcomes accurately and improve over time.
NLP, a subfield of ML, is the process of converting raw text data to computer-understandable (structured) data. NLP systems can then use this structured data to generate human (natural) language.
AI content generation is revolutionizing the content creation industry. We’ve yet to see whether this technology will be a net positive or negative for the digital world.
Meanwhile, business owners, journalists, and digital marketers today are searching for ways to detect AI content.
At Digital Spotlight, we investigated our external and internal writers’ content using six AI content detection tools (AI Classifier, Originality.ai, Copyleaks, Content at Scale, Corrector App, and Crossplag) to ensure that the articles are 100% human-written.
Our standard process is to copy and paste the article draft into the AI detection tool. Any output with more than a 25% AI score on any AI content detector is considered failed, meaning that it was likely produced by AI.
Note that a 25% AI score does not mean that the article has 25% AI content. Instead, a 25% AI score means there’s a 25% probability that the article is AI-generated.
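The pass/fail rule described above can be sketched in a few lines of Python. The threshold comes from our process; the function names and the sample draft are hypothetical and only illustrate the logic.

```python
FAIL_THRESHOLD = 25.0  # a draft fails when any detector scores it above this AI%


def failed_tools(tool_scores, threshold=FAIL_THRESHOLD):
    """Return the detectors whose AI% score is above the threshold."""
    return sorted(tool for tool, score in tool_scores.items() if score > threshold)


def is_failed(tool_scores, threshold=FAIL_THRESHOLD):
    """A draft fails if at least one detector scores it above the threshold."""
    return bool(failed_tools(tool_scores, threshold))


# Hypothetical draft: only one of the three detectors exceeds 25%.
draft = {"Originality.ai": 29.0, "Copyleaks": 20.2, "Crossplag": 1.0}
print(is_failed(draft), failed_tools(draft))
```

Listing the tools that flagged the draft, rather than returning only a yes/no answer, makes it easier to give the writer specific feedback.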
Our current data shows that 3 of the 21 articles reviewed failed in two AI detection tests. Meanwhile, five articles failed in Originality.ai alone.
We then provide the feedback to the writer and request a revision to match our standards. We continue to optimize the articles until we get a less than 25% AI score.
This strict content creation process shows how much Digital Spotlight values original content – the kind of content that is relatable and engaging.
The following tables show our latest AI detection report on our internal and external writers’ submitted work:
|Article||Writer||AI % Score from Originality.ai||AI % Score from Copyleaks||AI % Score from Content at Scale||AI % Score from Corrector App||AI % Score from Crossplag|
|Article 1||Writer 1||16%||94.20%||99%||4.30%||1%|
|Article 2||Writer 2||29%||96.30%||100%||0.16%||1%|
|Article 3||Writer 3||15%||94.20%||98%||10.22%||1%|
|Article 4||Writer 4||20%||98.10%||100%||3.67%||1%|
|Article 5||Writer 5||28%||93.70%||91%||3.12%||1%|
|Article 6||Writer 6||5%||94.30%||95%||1.55%||1%|
|Article 7||Writer 7||22%||98.70%||100%||0.03%||1%|
|Article 8||Writer 8||41%||85.40%||86%||11.88%||1%|
|Article 9||Writer 9||33%||95%||100%||7.74%||1%|
|Article 10||Writer 10||4%||96.40%||100%||0.16%||1%|
|Article 11||Writer 11||12%||98.20%||100%||8.16%||1%|
|Article 12||Writer 12||70%||80.09%||78%||27.65%||9%|
|Article 13||Writer 13||23%||80.90%||86%||6.06%||16%|
|Article 14||Writer 14||10%||96.40%||99%||0.18%||18%|
|Article 15||Writer 15||52%||89.50%||93%||7.81%||1%|
|Article 16||Writer 16||74%||89.50%||100%||19.92%||1%|
|Article 17||Writer 17||26%||95.65%||85%||7%||35%|
|Article 18||Writer 18||43%||81.04%||76%||24.46%||16%|
|Article 19||Writer 19||19%||91.10%||100%||7.86%||1%|
|Article 20||Writer 20||24%||93.80%||96%||0.18%||5%|
|Article 21||Writer 21||90%||83.70%||79%||20.60%||1%|
|Article 22||Writer 22||18%||98.80%||100%||0.03%||1%|
|Article 23||Writer 23||9%||93.70%||99%||6.08%||1%|
|Article 24||Writer 24||15%||96.10%||92%||0.08%||58%|
|Article 25||Writer 25||13%||97.30%||99%||1.16%||0%|
|Article 26||Writer 26||73%||79.10%||88%||19.68%||0%|
|Article 27||Writer 27||11%||64.60%||88%||0.54%||0%|
|Article 28||Writer 28||37%||94.60%||96%||2.91%||0%|
|Article 29||Writer 29||3%||95.20%||99%||2.61%||1%|
|Article 30||Writer 30||11%||93.50%||97%||0.81%||1%|
|Article 31||Writer 31||1%||97.20%||99%||0.59%||1%|
|Article 32||Writer 32||7%||94.80%||98%||0.63%||1%|
|Article 33||Writer 33||24%||92.30%||98%||4.88%||1%|
|Article 34||Writer 34||17%||98.10%||99%||1.02%||1%|
|Article 35||Writer 35||6%||97.70%||99%||0.30%||1%|
|Article 36||Writer 36||9%||94.70%||92%||0.58%||1%|
|Article 37||Writer 37||20%||48.32%||100%||3.80%||65%|
|Article 38||Writer 38||7%||98.70%||98%||0.22%||1%|
|Article||Writer||AI % Score from Originality.ai||AI % Score from Copyleaks||AI % Score from Content at Scale||AI % Score from Corrector App||AI % Score from Crossplag|
|Article 1||Writer 1||60%||15.50%||7%||0.53%||1%|
|Article 2||Writer 2||40%||7.60%||0%||2.00%||1%|
|Article 3||Writer 3||5%||1.10%||0%||2.00%||1%|
|Article 4||Writer 4||29%||20.20%||3%||2.00%||1%|
|Article 5||Writer 5||4%||9%||1%||0.08%||2%|
|Article 6||Writer 6||0%||9%||0%||0.46%||1%|
|Article 7||Writer 7||8%||5.90%||14%||0.03%||1%|
|Article 8||Writer 8||4%||4.50%||0%||0.02%||1%|
|Article 9||Writer 9||0%||0.10%||0%||5.00%||1%|
|Article 10||Writer 10||1%||3.40%||0%||0.03%||1%|
|Article 11||Writer 11||36%||14.50%||9%||0.02%||1%|
|Article 12||Writer 12||44%||5.30%||17%||0.02%||1%|
|Article 13||Writer 13||7%||3.20%||15%||2.78%||1%|
|Article 14||Writer 14||8%||1.30%||2%||0.02%||1%|
|Article 15||Writer 15||0%||0.50%||15%||0.02%||1%|
|Article 16||Writer 16||2%||4.40%||0%||0.02%||1%|
|Article 17||Writer 17||1%||3.20%||8%||0.02%||1%|
*Note: We regularly update these trackers to give accurate results and feedback.
Readers must also note that AI content detection technology is relatively new.
These AI content detection tools require much improvement, especially regarding their accuracy and reliability in predicting AI content.
For example, a 2,438-word article from one of our internal writers initially got a 32% AI content score from Crossplag. However, the writer managed to bring the Crossplag AI percentage score down to 5% simply by switching the placements of two paragraphs.
Another article (1,938 words) from our internal writers initially got a 72% human content score from Content at Scale. However, the score rose to 92% after a mere change in line spacing.
These real-world examples indicate that users should consider AI content detectors as just one of the many tools used in a holistic evaluation of a writer’s work.
On January 31, 2023, OpenAI launched AI Classifier, a fine-tuned GPT model trained to differentiate between human-written and AI-generated text.
AI Classifier can help identify AI-generated text, including ChatGPT-generated text. The tool aims to counter false claims that a piece of AI-generated content is human-written.
Despite the tool’s potential uses, OpenAI stated that its newly released AI detection tool is not yet fully reliable.
For example, based on OpenAI’s evaluations, the AI Classifier correctly identified 26% of AI-produced text (true positives) as “likely AI-written.” However, the tool incorrectly marked human-made content as AI-written 9% of the time.
Consequently, they advised users not to use AI Classifier as a primary decision-making tool but only as a complementary method to other tools.
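A short calculation shows why these two rates make the tool a weak primary decision-maker. The sketch below applies Bayes’ rule to OpenAI’s reported figures (26% true positives, 9% false positives). The share of submissions that are actually AI-written (`ai_share`) is a hypothetical assumption, as is the function name.

```python
def share_of_flags_that_are_ai(ai_share, true_positive_rate=0.26, false_positive_rate=0.09):
    """Return the fraction of flagged texts that are really AI-written."""
    flagged_ai = ai_share * true_positive_rate            # AI texts correctly flagged
    flagged_human = (1 - ai_share) * false_positive_rate  # human texts wrongly flagged
    return flagged_ai / (flagged_ai + flagged_human)


# ai_share is a hypothetical assumption about how many submissions are AI-written.
for ai_share in (0.1, 0.5):
    print(ai_share, round(share_of_flags_that_are_ai(ai_share), 3))
```

Under these assumptions, if only 1 in 10 submissions is AI-written, roughly a quarter of the texts the classifier flags are actually AI-written, and the rest are human-written texts flagged by mistake.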
The company listed some limitations that you should consider when using AI Classifier, including the following:
AI text generators have raised concerns among educators about the potential for AI-enabled cheating.
Some educators are asking the AI community for input on how to detect AI-produced content in educational settings better.
OpenAI published a preliminary resource regarding ChatGPT, which lists the uses and limitations of the chatbot.
While significant limitations exist for the AI Classifier, OpenAI can update and retrain the language model based on successful attacks.
For example, OpenAI tweaked the confidence threshold for their web app so that the false positive rate becomes low, meaning the app only marks text as “AI-generated” if the classifier is very confident.
We also use Originality.ai, a paid tool, as a plagiarism checker and AI detector.
The founder of Originality reported that 99.41% of the time, the company’s AI detector correctly identifies a text as GPT-3-, GPT-3.5- (the latest OpenAI model), or ChatGPT-generated.
At Digital Spotlight, it’s a standard policy that if a piece of content consistently shows less than a 25% AI score, we consider it most likely human-written.
Originality.ai has a tool that checks entire websites at once. However, a single check proves little: scores that stay consistently low or consistently high across repeated checks are the most significant indicator of whether content is human-written or AI-generated.
Another tool that we use to build our dataset of AI% scores is the Copyleaks AI Detector.
The Copyleaks AI detector is a free tool that can help identify texts produced by almost any AI content generator, including ChatGPT.
Our latest data shows that Copyleaks is also quite good at detecting AI content. However, the tool might be less accurate in AI content detection than Originality’s AI detector.
Like other AI detection tools, Copyleaks AI detector can simulate human learning to improve its AI detection abilities over time.
Copyleaks uses full spectrum protection to detect a broad range of AI-written texts, from simple text generators to advanced deep-learning models.
Users can benefit from the tool’s features below.
Users can integrate Copyleaks AI Detector into other software applications using the tool’s API.
If you are a business owner, you can use this feature to access the detector’s functionality within your company’s workflow and systems.
Educators can access the functionality of Copyleaks AI Detector from their native LMS platform. They can use the tool’s plagiarism and AI content detection to verify the students’ submissions.
Users can also access AI detection using the Copyleaks single, easy-to-use interface to verify content originality.
In addition, Copyleaks AI Detector is available as a Google Chrome extension, enabling users to verify the originality of the web pages they visit.
Consequently, you can check posts on your favourite shopping sites and news articles on social media.
The Copyleaks AI Detector works across various languages, including English, Spanish, German, French, and Portuguese.
We also used Content at Scale AI Detector to collect AI% scores from various articles. Our latest data indicates that this tool has similar accuracy to Copyleaks AI Detector.
Content At Scale AI Detector provides users with a human content score, indicating whether a text is AI-generated or humanly optimized.
Like the AI detectors above, the Content At Scale AI Detector can help identify if the content comes from an AI writer, including ChatGPT, GPT-3, and other AI models.
This free AI detection tool is specifically designed to identify whether the content is partially or entirely GPT-3 algorithm-created.
Generative Pre-trained Transformer 2 (GPT-2) content detectors can detect texts generated by GPT-2-based AI.
For example, the GPT-2 output detector, hosted by Hugging Face, a New York-based data science company, was built to detect content generated by OpenAI’s GPT-2.
GPT-3 is a machine learning neural network trained using internet data to generate text of any type.
Examples of GPT-3 content detectors include Originality.ai, Copyleaks AI Detector, and Writer AI Content Detector.
These tools offer free and paid services. You can click their pricing tab to see what services best apply to you.
As of January 2023, the CEO (chief executive officer) of OpenAI has yet to confirm the release of GPT-4. This implies that there are no GPT-4 content detectors yet.
Crossplag is a fine-tuned model based on the OpenAI dataset. This tool can analyze 1,000 words at a time and only works for English texts.
Crossplag is another AI content detector designed to identify AI-generated content across a wide range of applications.
Our data suggest that Crossplag may be less accurate than most existing AI detection apps.
For example, Originality.ai detected 60% AI content in “Article 1”. In contrast, Crossplag gave the same article a 1% AI content score.
Other AI detection mechanisms can also help with AI content detection.
For example, another potential AI detection method is “watermarking.” A watermark for chatbots can signal AI-written texts.
Some reports say that OpenAI, the creator of ChatGPT, is currently developing a watermark to determine output from its GPT text AI.
You can use AI detection tools in various ways. Below are some areas in which AI detection apps can help you.
The pace of development in AI content generation suggests that AI writing will become more capable of producing realistic, human-like content.
For many, this scenario means the best AI systems will eventually succeed in evading AI content detection mechanisms.
However, while some computer scientists work to make AI writers more human-like, others develop AI detection tools to improve their detection accuracy.
Digital Spotlight’s current data suggests that AI content detectors can detect AI content.
Consequently, we believe high-quality content that provides real value to users remains a great asset to those working in the content creation industry.
Original, creative content by human writers provides fewer generalities and more valuable insights – human-written content is more relatable and engaging.
On January 2, 2023, Edward Tian, a senior at Princeton University, announced the beta version of GPTZero on Twitter.
GPTZero is a software application that aims to detect whether a piece of text is human-written or ChatGPT-generated. The app uses these two metrics to make its “judgments”: perplexity and burstiness.
Perplexity refers to the randomness of the text’s word choice, while burstiness compares this factor across sentences.
A high perplexity score suggests that a sentence is human-made. Burstiness looks at how much that score varies across the entire content.
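The two metrics can be sketched as follows. This is not GPTZero’s actual implementation, which we have not seen. It uses the standard definition of perplexity (the exponential of the average negative log-probability per word) and, as our own assumption, measures burstiness as the standard deviation of per-sentence perplexity. The word probabilities are hypothetical values that a language model might assign.

```python
import math
from statistics import pstdev


def perplexity(token_probabilities):
    """Perplexity of one sentence: exp of the average negative log-probability per token."""
    log_probs = [math.log(p) for p in token_probabilities]
    return math.exp(-sum(log_probs) / len(log_probs))


def burstiness(sentences):
    """Spread (population standard deviation) of per-sentence perplexity. An assumed definition."""
    return pstdev([perplexity(s) for s in sentences])


# Hypothetical per-word probabilities from a language model, one list per sentence.
uniform_text = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]  # every sentence equally predictable
varied_text = [[0.5, 0.5, 0.5], [0.1, 0.1, 0.1]]   # one predictable, one surprising sentence

print(burstiness(uniform_text), burstiness(varied_text))
```

In this toy example, the text whose sentences are all equally predictable has almost no burstiness, while the text that mixes predictable and surprising sentences has a much higher value, which is the pattern a detector of this kind would associate with human writing.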
Our data indicate that existing AI-writing detection tools have different detection accuracies.
These results suggest that even a high AI content percentage score does not always mean that the text’s author is an AI.
Suppose you are an educator. In that case, you should use AI content detection software to provide feedback to your students instead of just focusing on penalizing plagiarized submissions.
Even though current AI-writing detection tools are imperfect, any writer hoping to pass off an AI writer’s work as their own could be exposed as detection tools become more accurate.
The education system can adapt to the post-ChatGPT world by understanding what makes human-written texts different from AI-generated ones.
For example, higher education institutions can focus more on oral evaluations or design exams that do not rely on digital technologies.
Catching a Unicorn with GLTR: A tool to detect automatically generated text
English Google SEO office-hours from November 2022
Spam policies for Google web search
New AI classifier for indicating AI-written text
OpenAI CEO Refuses to Confirm If Chat GPT—4 Will Be Released This Year
Edward Tian ’23 creates GPTZero, software to detect plagiarism from AI bot ChatGPT