Pegasus | Devpost

Welcome to Pegasus!
Pegasus' lightning fast algorithm for indexing and searching securities.
Uploading code to Pegasus' decentralized vulnerability blockchain.
Found Vulnerability on Code
AWS Lambda Function

Pegasus Bio

Inspiration

While brainstorming for this competition, we were particularly intrigued by the Schonfeld and MITRE challenges. How could we push the boundaries of search and indexing engines? How can we also leverage the same innovations on web3 while building a product that individuals and organizations see value in and can use daily? That's how we came up with Pegasus - a lightning-fast and mythical searching & indexing engine with a built-in source code vulnerability blockchain.

What it does

As of right now, our Pegasus engine is focused on two features:

1) Crazy-fast searching and indexing of securities and street IDs: Pegasus lets you upload or drag and drop CSV files to better understand securities and street IDs. Depending on the dataset, Pegasus builds a priority model that ranks securities and street IDs by how relevant they are to your queries.

2) Decentralized code vulnerability inspection: Pegasus lets you create your own vulnerability inspection rules, or use pre-made ones from other companies and organizations, to inspect source code. Right now, Pegasus can inspect code in the following languages: JavaScript, Python, C, and C++. It uses a custom-made regex-based language to define and quickly identify vulnerabilities in code. When a vulnerability is discovered, Pegasus alerts you to the vulnerability type, points out the lines of code causing it, and links to any relevant MITRE pages (CWE, CAPEC). Pegasus also scans the code continuously in the background, so if a new rule is added to a rule set that your organization has selected, it looks through all previously uploaded source code and sends an alert. This lets organizations fix code the moment a vulnerability is reported and address MITRE Initial Access TTPs at the root of the issue.
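The inspection flow described above (match a rule, report the offending lines, link a MITRE reference) could be sketched roughly as follows. This is a minimal illustration, not the Pegasus implementation: it uses plain Python regular expressions in place of the custom rule language, and the `Rule`, `Finding`, and `scan` names, as well as the two example rules, are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str       # hypothetical rule label
    pattern: str    # plain Python regex, standing in for the custom rule language
    reference: str  # MITRE identifier to link in the alert, e.g. a CWE entry


@dataclass
class Finding:
    rule: str
    line_number: int
    line: str
    reference: str


def scan(source: str, rules: list[Rule]) -> list[Finding]:
    """Check every line of `source` against every rule and collect matches."""
    compiled = [(rule, re.compile(rule.pattern)) for rule in rules]
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule, regex in compiled:
            if regex.search(line):
                findings.append(
                    Finding(rule.name, number, line.strip(), rule.reference)
                )
    return findings


# Illustrative rules only; a real rule set would be far more careful.
RULES = [
    Rule("use-of-eval", r"\beval\s*\(", "CWE-95"),
    Rule("unbounded-strcpy", r"\bstrcpy\s*\(", "CWE-120"),
]

if __name__ == "__main__":
    sample = "x = 1\nresult = eval(user_input)\n"
    for finding in scan(sample, RULES):
        print(finding.line_number, finding.rule, finding.reference)
```

Re-running `scan` over previously uploaded files whenever a rule set changes would give the background alerting behavior described above.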

How we built it

Let's break down our Pegasus engine:

- Front-end and graphics: React.js, JavaScript, Adobe Illustrator, Adobe Premiere, and Adobe Photoshop.
- Back-end: IPFS, Python, Regex, Flask, AWS Lambda Function, AWS API Gateway, and AWS Mongo Instance.
- Pre-processor (part of our indexing and searching engine): Node.js.

To provide a standard format for code vulnerabilities, the code analysis tool uses multiple regex rules written and tested from scratch. Companies, researchers, and developers can contribute their own rules by simply creating a .rg file and sharing it with everyone through the IPFS network framework we provide. The file system parses those files and checks that their rules compile and are good to use. Once this verification is done, Pegasus flies to the IPFS and delivers all the new vulnerabilities to the network for indexing & scanning.
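As a rough sketch of the "parse, then check that it compiles" step, the snippet below validates a rule file before it would be shared. The actual .rg syntax is not described here, so this assumes a made-up layout of one `name: pattern` pair per line with `#` comments, and it compiles the patterns with Python's `re` module rather than the custom extended regex language. The `validate_rule_file` name is hypothetical.

```python
import re


def validate_rule_file(text: str) -> tuple[dict[str, re.Pattern], list[str]]:
    """Return (compiled rules, error messages) for a rule file.

    Assumed layout (not the real .rg format): one 'name: pattern' per line,
    with blank lines and lines starting with '#' ignored.
    """
    compiled: dict[str, re.Pattern] = {}
    errors: list[str] = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so patterns may contain colons.
        name, separator, pattern = line.partition(":")
        name, pattern = name.strip(), pattern.strip()
        if not separator or not name or not pattern:
            errors.append(f"line {number}: expected 'name: pattern'")
            continue
        try:
            compiled[name] = re.compile(pattern)
        except re.error as exc:
            errors.append(f"line {number}: {name}: {exc}")
    return compiled, errors


if __name__ == "__main__":
    demo = "# demo rules\nuse-of-eval: \\beval\\s*\\(\nbroken: (unclosed\n"
    rules, problems = validate_rule_file(demo)
    print(sorted(rules))
    print(problems)
```

Only files that come back with an empty error list would be published to the network under this scheme.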

Challenges we ran into

For Pegasus' indexing and searching engine, we tried a myriad of approaches to get query results to users faster. Most of our initial approaches led to crashed web browsers and slow processing. To overcome this challenge, we created a preprocessor server that processes large datasets of securities and caches the preprocessing results. This way, our front-end is only responsible for querying chunks of data instead of fully pre-processing and locally storing large datasets.

For Pegasus' web3 cybersecurity network & code analysis, we faced many problems that slowed us down. Since the entire file storage of the project is built on the IPFS, our code took time to be uploaded to and retrieved from the network. We also faced problems with the different encodings of the files, such as regular strings, base64 encoding, UTF-8 encoding, and raw byte data. It was hard for the regex engine to choose the correct way to search for a pattern if the code and the rule were encoded differently. Another fun issue we found was that a couple of files written on a Windows system caused problems because their line endings differ from those on Linux machines. It was pretty hard to debug all that while keeping Pegasus flying on the AWS Lambda Function.
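One common way to sidestep the encoding and line-ending mismatches described above is to normalize every input to a single text form before any pattern is applied. The helper below is a sketch under that assumption; the `normalize_source` name and its `is_base64` flag are hypothetical and not taken from the Pegasus code base.

```python
import base64


def normalize_source(data, is_base64: bool = False) -> str:
    """Convert str, bytes, or base64 input into text with '\n' line endings."""
    if is_base64:
        # b64decode accepts both ASCII strings and bytes-like objects.
        data = base64.b64decode(data)
    if isinstance(data, (bytes, bytearray)):
        # Undecodable bytes are replaced rather than raising an error.
        data = bytes(data).decode("utf-8", errors="replace")
    # Fold Windows (CRLF) and bare CR line endings into LF.
    return data.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    windows_file = b"int main() {\r\n  return 0;\r\n}\r\n"
    print(repr(normalize_source(windows_file)))
```

Applying the same normalization to both the rules and the uploaded code means the regex engine always compares like with like.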

Accomplishments that we're proud of

The Pegasus team is really proud of creating an engine that lets you dissect large datasets of securities and street IDs. While there are many improvements to be made, as with any other software project, we truly believe that we built something unique that can be used by people and organizations all over the world.

What we learned

The Pegasus team learned how to better manage, query, and index large datasets and present them in concise, workable, and actionable ways. We also learned the impact of creating a standard way for people to collaborate and make code more secure. As if this were not enough, we learned how to interact directly with the IPFS and perform all kinds of operations on its network.

What's next for Pegasus

Our team is looking forward to improving the Pegasus engine for even faster search and index results. We also plan to implement SQL and pagination for our queries and to expand the slate of languages supported by our vulnerability blockchain. Finally, we want to improve the vulnerability popup, both to make it more aesthetically pleasing and to make vulnerable lines easier to identify by adding a line-highlighting feature to the code display.

Built With

capital-one
css
ipfs
javascript
mitre
python
react
regex
web3

Submitted to

ShellHacks 2022
- Winner First Place (Virtual)
- Winner MITRE: Building Web3 (4th Place)

Created by

Hello, everyone! I worked on Pegasus' front-end and its searching and indexing algorithm with React.js and Node.js. I also created our little mascot, Peggy, on Adobe Illustrator.

Mauricio Costa
Computer Scientist & Software Engineer
Welcome everyone! I was responsible for decentralizing Peggy, our Pegasus mascot. I worked on the web3 components of the project to allow files to be retrieved and distributed in the IPFS network. I had the opportunity to work with AWS Lambda Functions and the AWS API Gateway to create a serverless backend.

Nathan Kurelo Wilk
M.S in FinTech, B.S in Computer Science, University of Central Florida
I wrote the code for the vulnerability detection feature including the example rules, all of which use a custom-made extended regex language, and vulnerable example code.

Adam Hassan