Lochness | Devpost

the lochness monster lies in the bayes...

Bot detection is a hard problem. Sites either rely on intrusive measures such as captchas and rate limiting, or fall prey to false user accounts and denial of service attacks. Lochness is a simple solution to this problem. Add our libraries to your website, and gain the ability to block 99.9% of bots, 99.9% of the time. This is achieved by a continuous measure of user credibility across websites using both traditional strategies and Bayesian machine learning, all stored in a radially expanding blockchain across users' browsers. With Lochness, websites can gain immediate knowledge about whether users are malicious, before they even send them a response.

Functional description

Fundamentally, Lochness operates on a decentralized ledger of "human credibility," where each user keeps a cryptographically secure copy of their credibility in browser storage as a cookie. Credit is determined by everyday actions on the web: the movement of your mouse, the size of your screen, your hashed keystrokes, etc. Every action a user takes on a site employing Lochness modifies their credit by a certain differential, either increasing or decreasing their credibility. This credit differential is computed through an analysis of user data that combines standard bot detection techniques with a Bayesian nonparametric model for time-series data (and is thus more resistant to adversarial attacks).

Fear not: Lochness does not have access to your data, nor do the services that employ it. Valid cookies can only be generated by trusted authorities from previous valid cookies (and only once); however, it is provably impossible to derive previous actions or credits from a current cookie. Additionally, unauthorized websites cannot read your credit, nor can they modify it. As such, the cryptographic model can be considered a "radially expanding blockchain," where all linear unforkable threads originate from a random genesis block functor.

A practical use of Lochness might be a page that would normally be protected by rate limiting or a captcha, such as a login page or a small VPS instance. If a user visited that page with sufficient credit, they would be allowed free access, no questions asked. If they didn't, they would be given the opportunity to prove themselves by providing Lochness with more data, effectively a "captcha." As such, 99% of users would be free to browse the web more smoothly and easily than ever before. Lochness makes it much harder to accumulate credit than to lose it; as such, even if a cookie is transferred to a malicious actor, very limited damage can be done before the cookie is depleted (similar in magnitude to a human answering a single captcha for a bot).
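The gating logic described above can be sketched in a few lines. This is only an illustration: the threshold and the gain/loss scaling factors are hypothetical values chosen for the example, and the function names are not part of the Lochness libraries. The 0-1000 range is the one given for the Currency field in the block table.

```python
MIN_CREDIT, MAX_CREDIT = 0, 1000  # range of the Currency field
ACCESS_THRESHOLD = 600            # hypothetical cutoff for free access
GAIN_SCALE = 0.25                 # hypothetical: gains are damped
LOSS_SCALE = 1.0                  # hypothetical: losses apply in full


def apply_differential(credit: int, differential: float) -> int:
    """Return the new credit after one observed action.

    Positive differentials are scaled down so that credit is harder
    to accumulate than to lose; the result is clamped to the valid range.
    """
    scale = GAIN_SCALE if differential > 0 else LOSS_SCALE
    updated = credit + round(differential * scale)
    return max(MIN_CREDIT, min(MAX_CREDIT, updated))


def gate(credit: int) -> str:
    """Decide whether to serve the page or ask for more evidence."""
    return "allow" if credit >= ACCESS_THRESHOLD else "challenge"
```

With these made-up constants, a cookie that changes hands loses credit four times faster than it could be earned back, which is the asymmetry that limits the damage a transferred cookie can do.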

Radial blockchain

The radial blockchain is a distributed datastore containing the ephemeral credibility score assigned to each user. Each "block" in the chain contains four pieces of data, each with its own purpose, and all of them are encrypted using AES with keys held only by our centralized services.

Item	Purpose	Type
Hash of Previous Block	If this hash is also in our centralized database, it proves that the previous block is valid, so each block only needs to contain information about the single block before it instead of the entire chain. It also prevents sharing of blocks, because only one child can be made from each block.	SHA-256
Referrer	Holds the site being accessed when the cookie was updated; useful statistics for bot detection	Raw String
Currency	Holds the actual value of their "credit"	Integer in range 0-1000
Prev data	Holds the data of the previous block; used to track changes in attributes such as User Agent to detect bots.	Gzipped JSON
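As a rough sketch of the block layout in the table, the four fields could be modelled as follows. The class and helper names are hypothetical, the JSON serialization is an assumption made so the block can be hashed, and the AES encryption layer is left out because the Python standard library does not provide it.

```python
import gzip
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    prev_hash: str    # SHA-256 hex digest of the previous block
    referrer: str     # site being accessed when the cookie was updated
    currency: int     # the user's credit, 0-1000
    prev_data: bytes  # gzipped JSON holding the previous block's data

    def __post_init__(self):
        if not 0 <= self.currency <= 1000:
            raise ValueError("currency must be in the range 0-1000")

    def to_bytes(self) -> bytes:
        """Serialize deterministically (assumed format, for hashing only)."""
        payload = {
            "prev_hash": self.prev_hash,
            "referrer": self.referrer,
            "currency": self.currency,
            "prev_data": self.prev_data.hex(),
        }
        return json.dumps(payload, sort_keys=True).encode("utf-8")

    def digest(self) -> str:
        return hashlib.sha256(self.to_bytes()).hexdigest()


def pack_prev_data(attributes: dict) -> bytes:
    """Gzip a JSON snapshot of attributes such as the User Agent."""
    raw = json.dumps(attributes, sort_keys=True).encode("utf-8")
    return gzip.compress(raw, mtime=0)


def unpack_prev_data(blob: bytes) -> dict:
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

Keeping the previous block's attributes inside the current block is what lets a verifier compare, say, two consecutive User Agent strings without any server-side record of the user.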

Of course, for the initial cookie there is no previous block, so a random genesis block is created and hashed, and the hash is added to our database. The only information stored on our servers is the list of hashes that are currently valid, which (because a hash cannot be reversed) cannot be abused to violate the privacy of users. When creating a new block, the "previous hash" stored in the parent is validated against the database and then removed from it. Then the parent itself is hashed, and that hash is both placed in the database and used in the creation of the new block. Deleting the hash when a child is made forces each block to have only one child, preventing a cookie from having multiple instances in use.

Radial blockchain diagram

Bayesian model

Our most novel (and powerful) method of computing changes in user credibility is our Bayesian machine learning algorithm. The system operates on user UI manipulations such as mouse drags, mouse clicks and keystrokes. The algorithm uses an unsupervised Bayesian model to compare a user's UI inputs to training data compiled over the course of the weekend, computing the likelihood that the user is a bot. The system is non-parametric, and thus does not need to be continually tweaked and tuned; Lochness' analysis performs well across a variety of human behavior and hardware specifications.
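To give a feel for how a nonparametric Bayesian comparison against training data can work, here is a deliberately small stand-in. It is not the Lochness model: it swaps in a Gaussian kernel density estimate over inter-keystroke intervals, assumes the intervals are independent given the class, and uses labelled example samples, whereas the model described above is unsupervised and works on time series. All function names, the bandwidth, and the prior are assumptions.

```python
import math


def kde_likelihood(x: float, samples: list, bandwidth: float) -> float:
    """Gaussian kernel density estimate of p(x) from training samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )


def posterior_bot(intervals, human_samples, bot_samples,
                  prior_bot=0.5, bandwidth=20.0):
    """Return P(bot | intervals) via Bayes' rule, computed in log space."""
    log_human = math.log(1.0 - prior_bot)
    log_bot = math.log(prior_bot)
    floor = 1e-12  # avoids log(0) for observations far from every sample
    for x in intervals:
        log_human += math.log(
            max(kde_likelihood(x, human_samples, bandwidth), floor))
        log_bot += math.log(
            max(kde_likelihood(x, bot_samples, bandwidth), floor))
    # Normalize without overflowing: subtract the larger log term first.
    top = max(log_human, log_bot)
    weight_bot = math.exp(log_bot - top)
    weight_human = math.exp(log_human - top)
    return weight_bot / (weight_bot + weight_human)


# Made-up inter-keystroke intervals in milliseconds, for illustration only.
HUMAN = [120, 180, 95, 240, 160, 310, 140]
BOT = [50, 50, 51, 49, 50, 50]
```

Under these toy samples, a session with perfectly regular 50 ms intervals scores close to 1 and a session with irregular intervals between 100 ms and 250 ms scores close to 0. Because the density is built directly from the samples, there is no distribution family or per-feature parameter to tune beyond the kernel bandwidth.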

The algorithm performed very well on our testing suite. We benchmarked it against a set of our own simple bot scripts as well as two third-party comment-posting bot implementations. Crucially, the system classified 99.9% of human sessions correctly, a false positive rate of 0.1% (we would never want a user to be classified as a bot), and detected 98.3% of bot sessions, a false negative rate of 1.7%, for a one-page session per agent.
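For reference, the two error rates are computed from a confusion matrix as shown below, treating "classified as bot" as the positive class. The counts in the usage example are invented purely to show the arithmetic; they are not the project's test data.

```python
def error_rates(true_pos: int, false_pos: int,
                true_neg: int, false_neg: int) -> dict:
    """Compute error rates where 'positive' means 'classified as bot'.

    false_pos: humans wrongly flagged as bots
    false_neg: bots wrongly accepted as humans
    """
    return {
        "false_positive_rate": false_pos / (false_pos + true_neg),
        "false_negative_rate": false_neg / (false_neg + true_pos),
    }


# Hypothetical run: 200 human sessions (1 flagged), 50 bot sessions (2 missed).
example = error_rates(true_pos=48, false_pos=1, true_neg=199, false_neg=2)
```

A low false positive rate is the property that matters most for users, since each false positive is a real person being challenged or blocked.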

Deployment and testing

A central node sits at the center of the radial blockchain, consisting of a database and an API server. To make the system scalable, we wrap the API server and database in Docker containers, which can then be deployed either locally or to the cloud. In addition to creating the Lochness central node and the accompanying JS and Python libraries that interface with its APIs, we also created a demo site in Flask to demonstrate ease of integration. This site was deployed in Docker as well, allowing us to put up multiple demos at once and thus demonstrate our ability to mutate cookies across different websites. We also used this site to collect data to train and test our ML model.

We think that Lochness is a viable solution for improving the web: a safer and more painless tomorrow.