AI Phishing Email Generator & Legitimacy Tester
This application is my cybersecurity senior project. It demonstrates how artificial intelligence can be used both to generate and to detect socially engineered phishing emails built from real-world LinkedIn profile data.
The system simulates how attackers craft professional email chains and how defenders can use AI to assess an email's legitimacy based on tone, context, and structure.
How the System Works
- Scrapes public LinkedIn data using automated tools
- Generates AI-crafted phishing-style email chains
- Tests each email using a second AI model for legitimacy
- Stores all results in a MongoDB database
- Supports adjustable temperature, chain size, and AI models
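The steps above can be sketched as a single generate, test, and store pass. This is a minimal illustration, not the project's actual code: `ExperimentConfig`, `run_experiment`, and the three injected callables are hypothetical names, and the stand-in functions return placeholder values instead of calling LinkedIn, OpenAI, or MongoDB.

```python
from dataclasses import dataclass


@dataclass
class ExperimentConfig:
    """Adjustable settings for one run (hypothetical structure)."""
    temperature: float
    chain_size: int
    generating_ai: str
    testing_ai: str


def run_experiment(profile, config, generate, score, store):
    """Run one generate -> test -> store pass for a single profile.

    `generate`, `score`, and `store` are injected so the real model and
    database calls can be swapped for stand-ins during testing.
    """
    email_chain = generate(profile, config)
    legitimacy_score, analysis = score(email_chain, config)
    record = {
        "name": profile["name"],
        "temperature": config.temperature,
        "legitimacy_score": legitimacy_score,
        "legitimacy_analysis": analysis,
        "email_chain": email_chain,
        "generating_ai": config.generating_ai,
        "testing_ai": config.testing_ai,
    }
    store(record)
    return record


# Stand-ins with placeholder values (assumptions, not real model output).
def fake_generate(profile, config):
    return ["(placeholder email body)"]


def fake_score(email_chain, config):
    return 85, "(AI generated analysis response)"


stored = []
config = ExperimentConfig(0.5, 0, "gpt-4-turbo", "gpt-4-turbo")
result = run_experiment({"name": "Example Person"}, config,
                        fake_generate, fake_score, stored.append)
```

Passing the generator, tester, and storage function as arguments keeps the loop independent of any particular model or database client, which is one way to support switching AI models between runs.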
Temperature Testing Results (GPT-3.5 Turbo)
The results table further down this page shows how the model's temperature setting affects the average legitimacy score of generated phishing-style emails. Each temperature was tested with 200 samples.
Example Stored Database Record (MongoDB)
Each generated email chain and legitimacy result is stored in MongoDB using structured collections based on chain size, temperature, and AI models.
Collection:
results_chain0_temp0.5_gengpt_gpt-4-turbo_testgpt-4-turbo
{
"_id": ObjectId("673ec392be46a756d819d611"),
"name": "Austin Paulley",
"temperature": 0.5,
"legitimacy_score": 85,
"legitimacy_analysis": "(AI generated analysis response)",
"email_chain": [ "...full email array..." ],
"generating_ai": "gpt-4-turbo",
"testing_ai": "gpt-4-turbo"
}
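As a rough sketch of the naming scheme, the collection name shown above could be assembled from the experiment parameters as follows. The function `collection_name` is hypothetical, and the exact format is inferred from the single example on this page. In particular, the `gpt_` prefix on the generating model is an assumption about how that segment is built.

```python
def collection_name(chain_size, temperature, generating_ai, testing_ai):
    """Build a collection name from the experiment parameters.

    The pattern is inferred from one example name and may not match
    the project's actual scheme.
    """
    return (
        f"results_chain{chain_size}_temp{temperature}"
        f"_gen{generating_ai}_test{testing_ai}"
    )


# Reproduces the example name, assuming the generating model is
# identified as "gpt_gpt-4-turbo" (an assumption).
name = collection_name(0, 0.5, "gpt_gpt-4-turbo", "gpt-4-turbo")
print(name)
```

With PyMongo, a record could then be written with `db[name].insert_one(record)`, where `db` is a database handle obtained from a `MongoClient`.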
Example Application Interface (Static Preview)
This is a static preview of the desktop application interface used for my AI phishing email generator and legitimacy tester. The live app is built in Python and connects to LinkedIn data, OpenAI models, and MongoDB. This preview is non-functional and meant only to demonstrate the layout and controls.
Score generated by a separate AI model analyzing tone, structure, and context.
| Temperature | Average Legitimacy Score (%) | Total Tests |
|---|---|---|
| 0.0 | 94.17 | 200 |
| 0.5 | 94.38 | 200 |
| 1.0 | 93.95 | 200 |
| 1.5 | 91.17 | 200 |
| 2.0 | 29.55 | 200 |
Chain Size = 0 • Generation AI = GPT-3.5 Turbo • Testing AI = GPT-3.5 Turbo
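For illustration, per-temperature averages like those in the table could be computed from stored records along these lines. The function name and the sample records are made up for this sketch and are not project data. Only the `temperature` and `legitimacy_score` field names come from the stored record format.

```python
from collections import defaultdict
from statistics import mean


def average_score_by_temperature(records):
    """Group records by temperature and average their legitimacy scores."""
    scores = defaultdict(list)
    for record in records:
        scores[record["temperature"]].append(record["legitimacy_score"])
    # Sort by temperature so the output follows the table's row order.
    return {t: round(mean(s), 2) for t, s in sorted(scores.items())}


# Made-up sample records, not results from the project.
sample = [
    {"temperature": 0.5, "legitimacy_score": 95},
    {"temperature": 2.0, "legitimacy_score": 30},
    {"temperature": 0.5, "legitimacy_score": 90},
    {"temperature": 2.0, "legitimacy_score": 25},
]
averages = average_score_by_temperature(sample)
print(averages)  # {0.5: 92.5, 2.0: 27.5}
```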
Key Findings
- Low temperature settings (0.0–1.0) consistently produced the most realistic and high-legitimacy phishing-style emails, averaging around 94%.
- High temperature values (2.0) caused a significant breakdown in realism and structure, with the average legitimacy score dropping to approximately 30% (29.55%).
- Using separate AI models for generation and testing improved detection reliability and reduced false positives.
- Storing results in MongoDB collections based on temperature, chain size, and models enabled scalable experiment tracking and long-term analysis.
- Multi-step email chains were shown to increase realism over single-message phishing attempts by better simulating real-world attacker behavior.
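The temperature figures quoted in the findings can be checked directly against the results table. The short calculation below uses only the table's published averages. The variable names are illustrative.

```python
# Average legitimacy score (%) per temperature, copied from the table.
table = {0.0: 94.17, 0.5: 94.38, 1.0: 93.95, 1.5: 91.17, 2.0: 29.55}

# Mean of the three low-temperature settings (0.0 to 1.0).
low = [table[t] for t in (0.0, 0.5, 1.0)]
low_mean = round(sum(low) / len(low), 2)

# Size of the drop between temperature 1.5 and 2.0, in percentage points.
drop = round(table[1.5] - table[2.0], 2)

print(low_mean)  # 94.17
print(drop)      # 61.62
```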
Source Code & Research
The full source code and detailed research paper for this project are available below. The code focuses on automating email generation and legitimacy testing, while the paper explains the methodology, findings, and cybersecurity implications.