ProctorAI: Intelligent Proctoring System Using OpenCV, Mediapipe, Dlib & Speech Recognition
ProctorAI is a real-time AI-based proctoring solution that uses a combination of computer vision and audio analysis to detect and alert on suspicious activities during an exam or assessment. This system uses OpenCV, Mediapipe, Dlib, pygetwindow, and SpeechRecognition to offer a comprehensive exam monitoring tool.
🔍 Key Features
- Face detection and tracking using mediapipe and dlib
- Eye and pupil movement monitoring for head and gaze tracking
- Audio detection for identifying background conversation
- Multi-screen detection via open window tracking
- Real-time alert overlays on camera feed
- Interactive quit button on the camera feed
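As one illustration of the open-window tracking feature, the sketch below filters a list of window titles and flags the session when more windows are open than allowed. The helper names, the `allowed_count` default, and the `ignored` parameter are assumptions made for this example, not taken from the ProctorAI source. The only pygetwindow call used is `getAllTitles()`, imported lazily so the filtering logic can be tried without the library installed.

```python
from typing import Iterable, List, Sequence


def visible_window_titles(titles: Iterable[str],
                          ignored: Sequence[str] = ()) -> List[str]:
    """Drop blank titles and any titles explicitly ignored by the caller."""
    cleaned = []
    for title in titles:
        title = title.strip()
        if title and title not in ignored:
            cleaned.append(title)
    return cleaned


def has_extra_windows(titles: Iterable[str],
                      allowed_count: int = 1,
                      ignored: Sequence[str] = ()) -> bool:
    """Return True when more titled windows are open than allowed.

    allowed_count=1 is an assumed default: only the exam window itself.
    """
    return len(visible_window_titles(titles, ignored)) > allowed_count


def current_window_titles() -> List[str]:
    """Fetch live window titles (requires pygetwindow on a supported OS)."""
    import pygetwindow  # lazy import: only needed for the live check
    return pygetwindow.getAllTitles()
```

In a running session, `has_extra_windows(current_window_titles())` could be polled periodically. Many desktops expose background windows with empty titles, which is why blanks are filtered out first.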
⚙️ How It Works
- The webcam feed is captured using OpenCV.
- Face and eye landmarks are detected using mediapipe. dlib tracks the pupil by analyzing the eye region.
- The system checks for head movement and for eye and pupil movement, and determines whether a face is present.
- Running applications are scanned using pygetwindow to detect multiple active windows.
- Background audio is captured and analyzed using speech_recognition.
- Alerts are displayed on-screen in real time if any suspicious activity is detected.
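The pupil check in the steps above can be sketched as a simple ratio: where the pupil sits between the two eye corners along the horizontal axis. This is a minimal sketch, not the project's actual method. The `EyeObservation` type, the function names, and the 0.35 / 0.65 thresholds are all assumptions, and the input points are assumed to be pixel coordinates already extracted from the landmark and pupil detectors.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y) in image pixel coordinates


@dataclass
class EyeObservation:
    """Hypothetical container for one eye's corner and pupil positions."""
    inner_corner: Point
    outer_corner: Point
    pupil: Point


def horizontal_gaze_ratio(eye: EyeObservation) -> Optional[float]:
    """Pupil position across the eye: 0.0 = smaller-x corner, 1.0 = larger-x corner.

    Returns None when the eye has no measurable width (e.g. bad landmarks).
    """
    left_x = min(eye.inner_corner[0], eye.outer_corner[0])
    right_x = max(eye.inner_corner[0], eye.outer_corner[0])
    width = right_x - left_x
    if width <= 0:
        return None
    ratio = (eye.pupil[0] - left_x) / width
    return min(1.0, max(0.0, ratio))  # clamp in case the pupil estimate overshoots


def classify_gaze(ratio: Optional[float],
                  low: float = 0.35, high: float = 0.65) -> str:
    """Label the gaze in image coordinates; thresholds are assumed values."""
    if ratio is None:
        return "unknown"
    if ratio < low:
        return "left"
    if ratio > high:
        return "right"
    return "center"
```

The labels are relative to the image, so "left" means toward smaller x in the frame. A mirrored webcam preview would swap the sides as seen by the student.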
🧠 Tech Stack
- OpenCV - Video capture and frame rendering
- Mediapipe - Facial landmark and face detection
- Dlib - Pupil detection and facial geometry
- SpeechRecognition - Audio analysis
- PyGetWindow - Application window detection
- Threading - For concurrent execution of detection modules
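One way the detection modules might run concurrently is sketched below: each detector runs in its own daemon thread and publishes its current alert to a lock-protected board that the video loop can read every frame. `AlertBoard`, `run_detector`, and `start_detectors` are hypothetical names for this example, and the polling interval is an assumed value. The actual structure of main.py may differ.

```python
import threading
from typing import Callable, Dict, List, Optional


class AlertBoard:
    """Thread-safe store holding at most one active alert per detector."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active: Dict[str, str] = {}

    def set(self, source: str, message: str) -> None:
        with self._lock:
            self._active[source] = message

    def clear(self, source: str) -> None:
        with self._lock:
            self._active.pop(source, None)

    def snapshot(self) -> List[str]:
        """Return a sorted copy of the active messages for drawing on a frame."""
        with self._lock:
            return sorted(self._active.values())


def run_detector(name: str, check: Callable[[], Optional[str]],
                 board: AlertBoard, stop_event: threading.Event,
                 interval: float = 0.05) -> None:
    """Poll one detector until stop_event is set; None means 'no alert'."""
    while not stop_event.is_set():
        message = check()
        if message:
            board.set(name, message)
        else:
            board.clear(name)
        stop_event.wait(interval)  # sleeps, but wakes early on shutdown


def start_detectors(checks: Dict[str, Callable[[], Optional[str]]],
                    board: AlertBoard,
                    stop_event: threading.Event) -> List[threading.Thread]:
    """Start one daemon thread per detector and return the thread handles."""
    threads = []
    for name, check in checks.items():
        thread = threading.Thread(target=run_detector,
                                  args=(name, check, board, stop_event),
                                  daemon=True)
        thread.start()
        threads.append(thread)
    return threads
```

Keeping slow work such as audio capture off the main thread lets the camera loop keep rendering frames, and the shared `stop_event` gives the quit button a single place to shut everything down.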
🚨 Alerts Triggered By
- Missing face (student left or covered the webcam)
- Sudden or excessive head movement
- Unusual pupil movement (possibly looking elsewhere)
- Multiple open windows (indicative of cheating)
- Background voice detected (someone speaking)
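The triggers listed above can be expressed as a small rule function that maps the latest detection results to alert messages. This is a sketch under stated assumptions: `FrameState`, its field names, the message strings, and the `max_head_offset` and `max_windows` thresholds are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameState:
    """Hypothetical summary of what the detectors reported most recently."""
    face_present: bool
    head_offset: float       # assumed normalized displacement, 0.0 = centered
    gaze: str                # "center", "left", "right", or "unknown"
    open_window_count: int
    voice_detected: bool


def collect_alerts(state: FrameState,
                   max_head_offset: float = 0.25,
                   max_windows: int = 1) -> List[str]:
    """Return the alert messages that apply to the given state."""
    alerts = []
    if not state.face_present:
        alerts.append("Face not detected")
    else:
        # Head and gaze checks only make sense while a face is visible.
        if abs(state.head_offset) > max_head_offset:
            alerts.append("Excessive head movement")
        if state.gaze in ("left", "right"):
            alerts.append("Unusual pupil movement")
    if state.open_window_count > max_windows:
        alerts.append("Multiple open windows")
    if state.voice_detected:
        alerts.append("Background voice detected")
    return alerts
```

Keeping the rules in one pure function makes the thresholds easy to tune and test without a webcam or microphone attached.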
📦 Installation
git clone https://github.com/anirbanduttaRM/ProctorAI
cd ProctorAI
pip install -r requirements.txt
Also, make sure to download shape_predictor_68_face_landmarks.dat from dlib.net and place it in the root directory. The file is distributed as a compressed .bz2 archive, so extract it first.
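A missing model file is an easy setup mistake, so a startup check along these lines may help. It is a sketch, not code from the repository: `find_landmark_model` and `load_predictor` are hypothetical helper names, and the project root is assumed to be the current working directory. `dlib.shape_predictor` is imported lazily so the path check can run even before dlib is installed.

```python
from pathlib import Path

MODEL_NAME = "shape_predictor_68_face_landmarks.dat"


def find_landmark_model(root: str = ".") -> Path:
    """Return the model path, or raise with a hint about where to get it."""
    path = Path(root) / MODEL_NAME
    if not path.is_file():
        raise FileNotFoundError(
            f"{MODEL_NAME} not found in {path.parent.resolve()}; "
            "download it from dlib.net and place it in the project root."
        )
    return path


def load_predictor(root: str = "."):
    """Load the 68-point landmark predictor from the project root."""
    import dlib  # lazy import: only needed once the file is known to exist
    return dlib.shape_predictor(str(find_landmark_model(root)))
```

Calling `load_predictor()` once at startup, before opening the webcam, surfaces the problem with a clear message instead of a failure partway through a session.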
▶️ Running the App
python main.py
📌 Future Improvements
- Face recognition to match identity
- Web integration for remote monitoring
- Data logging for offline audit and analytics
- Improved natural language processing for audio context
🤝 Contributing
Pull requests are welcome! For major changes, please open an issue first to discuss what you would like to change.
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
Made with ❤️ by Anirban Dutta