By Choon H. Lee
CEO, Abstra™ Company
An Abstra™ Publication
March 1, 2024
Everyone talks about AI these days, and yet in-depth knowledge is scarce. As a result, there is a lot of confusion about this space, or as Kai-Fu Lee - the “Oracle of AI” - pertinently said: “most people have no idea about AI and some people have the wrong idea about AI”. Until recently, AI remained a specialized area known only to a handful of programming nerds. The release of ChatGPT™ in November 2022 forced everyone, including laypersons, to discuss AI intelligently, as though they had spent years in AI research. Non-AI entities such as everyday businesses are now expected to make informed AI investment decisions, yet they lack the computer science and AI background necessary to do so.
The goal of this article is to close this gap. The objective of this column is to equip businesses - and other non-tech people involved in the AI discussion - with the right knowledge about AI, so that any discussion can move forward instead of going in circles. Among other things, this writing aims to help businesses understand Abstra™ AI technology: what this technology actually does and how it differs from what's already out there in the world.
This piece - more than anything - is a story. It details AI's development, how it works, and the compounding difficulty of building high precision AI systems as they advance, akin to breaking the sound barrier. It is an account of how the Abstra™ company - with nothing but its commitment to defy the odds - spent nearly 13 years achieving the seemingly impossible and succeeded in building the world's first high precision AI system for document information extraction.
The first person who put the concept of "Artificial Intelligence" on the map was the British mathematician and computer scientist Alan Turing, in the 1950s. Turing's story was featured in the 2014 movie “The Imitation Game”, starring Benedict Cumberbatch as Alan Turing. The movie, while telling a good story, did little - in the view of many critics - to enhance AI awareness and insufficiently recognized Turing as the "father of AI." Ironically, the title of the movie - "The Imitation Game" - hinted at the very definition of AI.
Early in the 1950s, Alan Turing introduced a thought-provoking question in a paper he published: “Can machines think?” To answer this question, Alan proposed a simple test known today as the "Turing Test".
The test was built on a popular parlor game of the era, the “imitation game”, in which an interrogator exchanges written questions and answers with two hidden players - a man and a woman - and tries to guess which is which. Alan ingeniously adapted this parlor game into a thought experiment, substituting one player with an intelligent computer playing the same imitation game. Instead of guessing gender, the third person aimed to distinguish which written answer originated from a human and which from a machine. This test - the famous Turing Test - became the de facto definition of AI: an AI machine is a system able to imitate a human so convincingly that the third person in the thought experiment cannot tell which response came from the human and which from the machine.
Ever since Alan Turing laid the foundation of AI in the 1950s, AI bifurcated into two separate paths:
- Expert Systems (ES)
- Machine Learning (ML)
Both technologies, with their own merits and drawbacks, are capable of passing the "Turing Test".
The drawback of an ES is the complex, time-intensive, and expensive process of developing such a system and expanding it into other areas. For example, it's difficult to take an autopilot built for planes and use that same code to create a similar one for automobiles. The upside - the ability to deliver pinpoint accurate results - makes an ES an appealing AI solution wherever accuracy is of prime importance, such as piloting a plane with hundreds of passengers on board.
ML's upside is its adaptability, given its vast knowledge inputs. An ML system responds to inquiries on virtually any topic in human knowledge. The downside of ML is directing its learning - tantamount to steering a child's education in the right direction given the diverse information sources available these days, online and offline.
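To make the contrast concrete, here is a deliberately tiny, hypothetical sketch - not drawn from any particular product - of the same task solved both ways: an expert-system style function encodes the calendar rules explicitly, while a machine-learning model only imitates the labeled examples it has been shown.

```python
# Hypothetical, deliberately tiny illustration of the two approaches.

# Expert-system style: the calendar rules are written down explicitly.
def is_valid_date_rules(day: int, month: int) -> bool:
    days_in_month = {1: 31, 2: 29, 3: 31, 4: 30, 5: 31, 6: 30,
                     7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31}
    return month in days_in_month and 1 <= day <= days_in_month[month]

# Machine-learning style: no rules, only labeled examples to imitate.
from sklearn.tree import DecisionTreeClassifier

examples = [[15, 6], [31, 4], [29, 2], [40, 1], [1, 13]]
labels = [True, False, True, False, False]
model = DecisionTreeClassifier().fit(examples, labels)

# The rule-based function "understands" the calendar; the model only mimics
# the handful of answers it has seen and may guess wrong on anything new.
print(is_valid_date_rules(30, 4), model.predict([[30, 4]]))
```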
An analogy helps: imagine two students preparing for the same exam, one who studies the underlying theory and one who only drills stacks of old exams. It is easy to see that the second student quickly becomes skilled at solving tests after viewing several dozen old exams. As is well known, exams typically follow a certain pattern, use certain language, and carry certain expectations about how the answers need to be expressed.
In contrast, the first student - before even solving any practice examples - needs to understand the theory behind the subject. Once the concept of a subject has been captured, the student can go about applying this new knowledge in practice examples.
Clearly, the first student has chosen a more difficult and longer path, but once this person has understood the concept of the material, he or she can apply this know-how to virtually any situation and example there is.
In contrast, the second student may pass the exam but, lacking a theoretical backbone, will struggle to apply concepts beyond the question framework of the old practice exams he or she was trained on.
This is because the second student learned the material empirically, focused solely on emulating the right answers, while the first student made a concerted effort not only to mimic the right answer but to understand the theory - the logic - behind the material.
ChatGPT™ learns and responds to questions the user asks a bit like that second student, while humans prefer to widen their mental horizon like the first student. Humans are inherently not happy to just emulate answers like a robot. What distinguishes humans is the ability to ask the “why”. Humans want to know more than just the right response, humans want to know the story that created the right answer.
This design also illustrates why systems such as ChatGPT™ hallucinate and have problems with accuracy: ML applications inherently lack a “conscious core” - a logical deduction capability to correctly apply subject matter expertise. This remains an inherent design flaw that is difficult for ML applications to overcome.
Humans, being prone to errors, habitually use logic to confirm and reconfirm answers. Before answering “5” to the question of what “2 plus 3” adds up to, people may check in the back of their minds whether subtracting “3” from “5” yields “2”. Ingrained in human thinking is the constant self-questioning - “Does this make sense?” - honed over a lifetime. The lack thereof is why, for example, Google's AI system Gemini recently - when asked to draw a picture of the first U.S. President, General George Washington - produced an image of an African-American general.
To illustrate another, more business-centric example: a CRE broker, using their training and logical deduction ability, understands that a Tenant paying only CAM (common area maintenance), RET (real estate taxes), and INS (insurance) expenses for the first 3 months of a commercial lease is practically receiving 3 months of free rent. An ML such as ChatGPT™ will fail to draw such conclusions from reviewing a commercial LOI (letter of intent), because such understanding requires logical reasoning grounded in domain expertise.
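As a hedged illustration - a minimal sketch, not Abstra™'s actual implementation, and with a field layout that is purely an assumption - the broker's deduction can be written down as an explicit expert-system rule operating on abstracted lease data:

```python
from dataclasses import dataclass

@dataclass
class LeaseMonth:
    """Charges abstracted for one month of a commercial lease (hypothetical schema)."""
    base_rent: float  # base (minimum) rent due
    cam: float        # common area maintenance
    ret: float        # real estate taxes
    ins: float        # insurance

def count_free_rent_months(months: list[LeaseMonth]) -> int:
    """Expert-system style rule: a month with no base rent, even if CAM/RET/INS
    pass-throughs are still owed, counts as a month of free rent."""
    return sum(1 for m in months if m.base_rent == 0)

# Usage: the first three months carry only CAM, RET, and INS charges.
schedule = [LeaseMonth(0, 500, 300, 100)] * 3 + [LeaseMonth(10_000, 500, 300, 100)] * 9
print(count_free_rent_months(schedule))  # -> 3, i.e. three months of free rent
```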
This inherent design limitation of ML also explains ChatGPT™'s difficulties with accuracy, as seen when abstracting commercial leases and LOIs.
For businesses, grasping an AI application's architecture and the type of AI technology it implements is crucial, as it instantly reveals the system's strengths, weaknesses, capabilities, and limitations without having to examine the inner workings of the app.
Just as knowing that someone traveled from A to B by train tells you they will need to be picked up at a train station - without being privy to any details about the train - knowing what AI technology was used, such as ES vs. ML, instantly tells a story about an AI system: what it is capable of and what hurdles the application will face.
Example: there is a new AI technology company called Docsum.ai. Its own website says it uses ChatGPT™ to summarize legal documents. The mere use of ChatGPT™, an ML framework, makes it possible to understand the product's capabilities and constraints without a detailed technical analysis.
As such, it is possible to accurately predict that Docsum.ai will be able to accept a variety of legal documents as input, such as confidentiality, operating, and other similar agreements. At the same time, given that Docsum.ai is built on the ML architecture underlying ChatGPT™, accuracy issues are guaranteed to appear and will most likely remain the biggest challenge for its users.
Unfortunately, achieving accuracy is both incredibly challenging and costly, yet crucial for the business community. Business applications require high-precision data because businesses in Corporate America function through processes, with participants acting as parts of a complex, well-oiled machine. Corporate workflow participants must act with precision and solutions need to deliver accurate data, adhering to the principle that a chain is only as strong as its weakest link.
An ML system - due to its inherent architectural design limitation - struggles to provide precision data consistently and reliably. For example, if ChatGPT™ were used to abstract a commercial real estate LOI or lease, the accuracy achieved would plateau at around 70 to 80%, given the current state of ML technology. Bear in mind that an average accuracy rate of 75% implies an error in every fourth data point, which inevitably necessitates human eyes and manual verification.
Human verification is time consuming and costly and, indeed, may defeat the purpose of using AI to automate the process in the first place. Verifying AI data manually may cost more and take longer than having humans perform the task by hand without AI. In other words, for an AI system to be useful in a corporate business setting, it requires a tremendously high accuracy threshold, typically in the 98-100% range.
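A back-of-the-envelope calculation - with hypothetical figures, assuming a lease abstract on the order of 100 data points - shows why the 98-100% range matters:

```python
# Rough arithmetic (hypothetical figures): expected number of wrong fields
# a reviewer must catch at a given extraction accuracy.
def expected_errors(num_fields: int, accuracy: float) -> float:
    return num_fields * (1.0 - accuracy)

for accuracy in (0.75, 0.90, 0.98):
    errors = expected_errors(100, accuracy)
    print(f"{accuracy:.0%} accurate -> roughly {errors:.0f} of 100 fields to correct")

# 75% accurate -> roughly 25 of 100 fields to correct (every fourth data point)
# 98% accurate -> roughly  2 of 100 fields to correct
```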
When creating an AI product, 50-65% accuracy can be reached relatively quickly, and with extra effort, up to 80% accuracy is attainable within a sensible period. However, exceeding the 80% accuracy benchmark requires immense effort for small improvements, akin to crossing the sound barrier. As AI development gets closer to the elusive 100% mark, the curve flattens further, often necessitating extraordinary efforts that may take many months, if not years, for minuscule percentage gains.
Abstra™'s AI Engine - powering Abstra™'s APDT™ application - comprises three distinct AI modules:
- AI Text Extraction Engine (ATE)
- AI Domain Knowledge and Logical Deduction Module (ADL)
- AI Data Management and Organizational Features (ADO)
The ATE is the workhorse responsible for delivering the initial text extractions; this engine is a hybrid architecture that contains elements of both ML and ES systems. The ADL module is a pure ES - the “control center” of the AI solution - which ensures that the data is accurate, makes sense, and is delivered to the user in the 98-100% accuracy range. The ADO is a collection of very smart functionalities which ensure the user has a practical work environment and is able to manage the data flow. This third component is likewise built upon an ES-ML hybrid AI platform.
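To make the division of labor tangible, here is a minimal, purely hypothetical sketch of how three such modules could be chained - the function names, toy logic, and data layout are assumptions for illustration, not Abstra™'s actual code:

```python
# Hypothetical sketch of a three-stage pipeline: extract, reason, organize.

def ate_extract(document_text: str) -> dict:
    """ATE stand-in: hybrid ML/ES pass producing the initial raw field extractions.
    Toy logic: pull 'key: value' lines out of the text."""
    fields = {}
    for line in document_text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return fields

def adl_validate(fields: dict) -> dict:
    """ADL stand-in: pure expert-system rules that cross-check the raw fields
    and flag anything that does not make logical sense."""
    issues = []
    if "term_months" in fields and not fields["term_months"].isdigit():
        issues.append("term_months is not a number")
    return {**fields, "issues": issues}

def ado_organize(fields: dict) -> dict:
    """ADO stand-in: packages validated data for the user's workflow (review, export)."""
    return {"needs_review": bool(fields.get("issues")), "data": fields}

def abstract_document(document_text: str) -> dict:
    # The three modules run in sequence: extract, then reason, then organize.
    return ado_organize(adl_validate(ate_extract(document_text)))

print(abstract_document("Tenant: Acme Corp\nterm_months: 36"))
```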
At Abstra™, unifying divergent AI architectures into a single hybrid system meant having them work together in a carefully balanced, harmonious dance, in perfect lock-step. As in a dance, both partners need to know exactly what the other is doing and learn to work with each other instead of against each other. Attaining such harmony involves tremendous practice, continuous fine-tuning, and many failed attempts until the right synergy is found. And, as with dancers, sustained practice brings a moment when both AI systems align, working in concert instead of alone.
The performance of a hybrid AI system, when functioning well, is as unmistakable as watching a harmonious dance: incongruent parts of an AI system somehow working seamlessly together, as if guided by an invisible mind, producing results with unprecedented precision and accuracy, like magic.
The way the Abstra™ team solved this problem was by building the entire AI application out of “Lego blocks”, making the application subject agnostic and rapidly adaptable to any new field. Instead of designing an application for one specific purpose, the Abstra™ team can use its set of “Lego blocks” to assemble an AI application for document information extraction deployable to any subject area with pinpoint accuracy.
The result of building such a highly modular system is stunning: in the past, building a new AI application of this kind could have taken years, but the modular APDT™ AI technology accomplishes the same feat within 90 days. During that period, the APDT™ AI system only needs about 200-400 varied document samples, a specification of the desired information, and a subject matter expert's input to reach the 98-100% data accuracy threshold.
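As a purely hypothetical sketch of what such “Lego block” modularity can look like in practice - the schema below is an assumption for illustration, not Abstra™'s actual configuration format - a new subject area might be described declaratively instead of being hard-coded:

```python
# Hypothetical, illustrative field specification for a new document type.
# An application assembled from reusable "Lego blocks" would read a spec like
# this and reuse the same extraction, validation, and data-management modules.
commercial_lease_spec = {
    "document_type": "commercial_lease",
    "fields": [
        {"name": "tenant_name", "type": "text"},
        {"name": "base_rent", "type": "currency"},
        {"name": "term_months", "type": "integer"},
        # An expert-system rule can be attached to a derived field, e.g. the
        # free-rent deduction discussed earlier.
        {"name": "free_rent_months", "type": "integer",
         "rule": "count months where base_rent == 0"},
    ],
}

# Switching to another subject area would mean swapping in a different spec,
# not rewriting the application.
print([f["name"] for f in commercial_lease_spec["fields"]])
```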
This makes Abstra™'s AI solution infinitely scalable across the business world when it comes to extracting information from documents: an AI solution applicable to any business, to any industry, to any field.
People learning about Abstra™ for the first time sometimes question why its products - AP™ and APDT™ - required nearly 13 years to develop. Some mistakenly believe the team's attention to the mission was divided; nothing could be further from the truth. AI product development ultimately exposes an unwritten law reminiscent of the sound barrier, where physics shifts from linear to exponential behavior: the effort to cross the high accuracy threshold in AI becomes exponentially harder the closer one gets.
It is a testament to the Abstra™ team - given its incredibly limited resources - that this goal was achieved with nothing but the team's unparalleled dedication, know-how, experience, and seamless collaboration. It speaks volumes that Abstra™ is presenting the world's first AI Precision System™ (defined as an AI system capable of achieving data accuracy in the 98-100% range) to the business community and defining the term “AI Precision System” in the process. It tells a compelling story about this firm: a realization that the people at Abstra™ have built not only a revolutionary product, but a culture capable of revolutionizing the world with AI.
Many viewed and still view building an AI Precision System as a pipe dream, something that cannot be achieved, especially with AI tools like ChatGPT™ around. But, as the late Nelson Mandela once said: “It always seems impossible until it is done.”
The path towards building a high accuracy AI system is a road lined with the graveyards of past attempts by companies with far greater resources and countless AI experts with résumés longer than the next person's. Yet this small group of highly dedicated people at Abstra™ never gave up on this promise, persevered like the little engine that could - and that miraculous journey took a meager 13 years.