COE292
Homework 1: Intelligent Explorer
1. Problem Overview
Inside a hazardous underground facility, engineers have deployed an autonomous robot explorer. The world is modeled as an 𝑛 × 𝑛 grid as shown:
row/col | 0 | 1 | 2 | … | 𝑛 − 1
0 | (0,0) | (0,1) | (0,2) | … | (0, 𝑛 − 1)
1 | (1,0) | (1,1) | (1,2) | … | (1, 𝑛 − 1)
2 | (2,0) | (2,1) | (2,2) | … | (2, 𝑛 − 1)
⋮ | ⋮ | ⋮ | ⋮ | ⋱ | ⋮
𝑛 − 1 | (𝑛 − 1,0) | (𝑛 − 1,1) | (𝑛 − 1,2) | … | (𝑛 − 1, 𝑛 − 1)
• The entrance is at the top-left corner (0,0).
• The exit is at the bottom-right corner (𝑛 − 1, 𝑛 − 1).
• The robot may move north, south, east, or west.
• If the robot tries to leave the grid, it stays in place but still consumes energy (counts as a step).
The facility contains obstacles. Exactly 10% of all cells are obstacles, and their coordinates are given as input. Moving into an obstacle cell costs the robot three times the energy of moving into any other cell.
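The movement and energy rules can be sketched as a single step function. This is only an illustration: the names `MOVES` and `step` are hypothetical and are not the starter's API (the starter provides its own `MiniMineEnv`), and charging exactly 1 unit for a move blocked by the grid boundary is an assumption.

```python
# Illustrative sketch; MOVES and step are hypothetical names, not the starter's API.
MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}  # (row delta, col delta)

def step(pos, move, n, obstacles):
    """Apply one move on an n x n grid and return (new_pos, energy_used)."""
    r, c = pos
    dr, dc = MOVES[move]
    nr, nc = r + dr, c + dc
    if not (0 <= nr < n and 0 <= nc < n):
        # Leaving the grid: the robot stays in place but the step still costs energy.
        # Assumption: a blocked move costs 1 unit.
        return pos, 1
    # Landing on an obstacle costs 3 units, any other cell costs 1.
    energy = 3 if (nr, nc) in obstacles else 1
    return (nr, nc), energy
```

For example, on a 4 × 4 grid with an obstacle at (0,1), `step((0, 0), "E", 4, {(0, 1)})` moves the robot to (0,1) and charges 3 units, while `step((0, 0), "N", 4, {(0, 1)})` leaves it at (0,0) and charges 1.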
At the start, the robot knows neither the best path nor the locations of the obstacles. Through repeated missions, it must learn until it finds an efficient strategy to reach the exit.
By the end, the robot should:
• Minimize the number of steps to the exit.
• Avoid obstacles whenever possible (to reduce energy cost).
2. Runtime & Submission
Runtime limit: 10 minutes | Memory limit: 1 GB
Starter file: COE292_Intelligent_Explorer_Starter.ipynb (posted along with this file)
Submission format: a single `submission.py` (export from your notebook) + your AI-disclosure PDF.
3. How Your Code Is Tested
Your code will be tested using two test cases:
1. Test Case 1 (Training): The robot starts with no knowledge. You must run 1000 episodes, show progress by printing the knowledge matrix at specific intervals, and finally output the best path and its energy cost.
2. Test Case 2 (Step-by-step updates): You are given a fixed path, an initial knowledge matrix, and explicit obstacles. You must show how knowledge updates after each step and then print the final matrix.
Both test cases will be executed multiple times with randomly generated numbers within the constraints, ensuring that your program can handle different valid inputs. Details of how the testing will be done are given below.
3.1. Test Case 1: Full Training from Zero Knowledge
Input format:
n; future_weight; learning_rate; exploration_rate; obstacles= {(r1,c1), (r2,c2),…}
where:
▪ n: grid size, 4 ≤ 𝑛 ≤ 200
▪ future_weight: how much future outcomes matter
▪ learning_rate: how quickly the robot adapts to new data
▪ exploration_rate: how often the robot explores random moves
▪ obstacles: explicit set of obstacle coordinates
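A parser for this input line might look like the following minimal sketch. The function name `parse_training_input` is hypothetical; the starter defines its own parser signatures, which must be kept. The sketch assumes fields are separated by `;` and that obstacle coordinates are non-negative integers.

```python
import re

def parse_training_input(line):
    """Parse 'n; future_weight; learning_rate; exploration_rate; obstacles={...}'.

    Hypothetical helper for illustration; the starter's parser names differ.
    """
    # Everything before the word 'obstacles' holds the four numeric fields.
    head, _, obstacle_part = line.partition("obstacles")
    fields = [f.strip() for f in head.split(";") if f.strip()]
    n = int(fields[0])
    future_weight, learning_rate, exploration_rate = map(float, fields[1:4])
    # Collect every '(r,c)' pair, tolerating optional spaces.
    pairs = re.findall(r"\(\s*(\d+)\s*,\s*(\d+)\s*\)", obstacle_part)
    obstacles = {(int(r), int(c)) for r, c in pairs}
    return n, future_weight, learning_rate, exploration_rate, obstacles
```

Storing the obstacles as a set of `(row, col)` tuples keeps the later "is this cell an obstacle?" check a constant-time lookup.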
Requirements (output):
1. Run 1000 missions (episodes).
o Each mission ends when the exit is reached or after 𝑛² steps, whichever comes first.
o Energy: +1 for a normal move, +3 if landing cell is an obstacle.
2. Print the knowledge matrix:
o At episode 0.
o After every 100 episodes.
o At the final episode (1000).
3. After training ends, print:
o The best path from start to exit (as a sequence of coordinates).
o The total energy consumed by that path.
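One way to organize the training loop above is tabular Q-learning with ε-greedy action selection. The sketch below is illustrative only and rests on several assumptions: the reward values (+10 for reaching the exit, −3 for landing on an obstacle, −1 otherwise), the `train` signature and its `seed` parameter, and printing the per-cell maximum over actions as the knowledge value. The starter's own signatures, seeding, and printing helpers take precedence.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, 1), (0, -1)]  # north, south, east, west

def print_knowledge(q, episode):
    """Print the best action value of every cell (assumed knowledge-matrix format)."""
    print(f"Episode {episode}: Knowledge Matrix")
    for row in q:
        print([round(max(cell), 1) for cell in row])

def train(n, future_weight, learning_rate, exploration_rate, obstacles,
          episodes=1000, seed=0):
    rng = random.Random(seed)  # assumption: local seeding instead of the starter's helper
    exit_cell = (n - 1, n - 1)
    # Q-table: one value per (row, col, action), all zero at the start.
    q = [[[0.0] * 4 for _ in range(n)] for _ in range(n)]
    print_knowledge(q, 0)
    for episode in range(1, episodes + 1):
        r, c = 0, 0
        for _ in range(n * n):  # a mission is capped at n^2 steps
            if rng.random() < exploration_rate:
                a = rng.randrange(4)  # explore: random move
            else:
                a = max(range(4), key=lambda k: q[r][c][k])  # exploit: best known move
            dr, dc = ACTIONS[a]
            nr, nc = r + dr, c + dc
            if not (0 <= nr < n and 0 <= nc < n):
                nr, nc = r, c  # leaving the grid: stay in place
            # Assumed reward scheme (not specified in the handout).
            if (nr, nc) == exit_cell:
                reward = 10.0
            elif (nr, nc) in obstacles:
                reward = -3.0
            else:
                reward = -1.0
            best_next = 0.0 if (nr, nc) == exit_cell else max(q[nr][nc])
            # Standard Q-learning update.
            target = reward + future_weight * best_next
            q[r][c][a] += learning_rate * (target - q[r][c][a])
            r, c = nr, nc
            if (r, c) == exit_cell:
                break
        if episode % 100 == 0:
            print_knowledge(q, episode)
    return q
```

Because every non-exit reward is negative and the table starts at zero, untried actions look better than tried ones, which pushes the robot to explore early even when `exploration_rate` is small.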
3.2. Sample Input for Test Case 1
4 ; 0.9 ; 0.3 ; 0.2 ; obstacles={(0,1),(2,2)}
Sample Output for Test Case 1
Episode 0: Knowledge Matrix
[0, 0, 0, 0]
[0, 0, 0, 0]
[0, 0, 0, 0]
[0, 0, 0, 0]
Episode 100: Knowledge Matrix
[0.0, -0.5, -0.8, -1.0]
[-0.2, -0.7, -1.1, -0.3]
[-0.4, -0.9, -3.0, -0.8]
[-0.5, -0.6, -0.9, 10.0]
Episode 200: Knowledge Matrix
[0.0, -0.4, -0.7, -0.9]
[-0.1, -0.6, -0.9, -0.2]
[-0.3, -0.8, -3.0, -0.7]
[-0.4, -0.5, -0.8, 10.0]
… (continue for every 100 episodes)
Episode 1000: Knowledge Matrix
[0.0, -0.2, -0.4, -0.6]
[0.0, -0.3, -0.5, -0.2]
[-0.1, -0.6, -3.0, -0.3]
[-0.2, -0.4, -0.5, 10.0]
Best Path Found:
(0,0) → (1,0) → (2,0) → (3,0) → (3,1) → (3,2) → (3,3)
Energy Consumption of Path: 7
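Once training has finished, a path such as the one above can be read off by repeatedly taking the highest-valued move from the entrance. The sketch below assumes a hypothetical table layout `q[row][col][action]` with actions ordered north, south, east, west; `greedy_path` and `format_path` are illustrative names, not functions from the starter.

```python
ACTIONS = [(-1, 0), (1, 0), (0, 1), (0, -1)]  # north, south, east, west

def greedy_path(q, n):
    """Follow the highest-valued action from (0,0).

    Stops at the exit or after n*n moves, so a poorly trained table cannot loop forever.
    """
    r, c = 0, 0
    path = [(r, c)]
    for _ in range(n * n):
        if (r, c) == (n - 1, n - 1):
            break
        a = max(range(4), key=lambda k: q[r][c][k])
        dr, dc = ACTIONS[a]
        nr, nc = r + dr, c + dc
        if 0 <= nr < n and 0 <= nc < n:
            r, c = nr, nc  # moves off the grid leave the robot in place
        path.append((r, c))
    return path

def format_path(path):
    """Render a path in the '(r,c) → (r,c)' style of the sample output."""
    return " → ".join(f"({r},{c})" for r, c in path)
```

The step cap matters: if the learned values are still poor, the greedy walk can bounce between two cells, and the cap makes the function return instead of hanging.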
3.3. Test Case 2: Step-by-Step Updates
Input format:
n; … obstacles={(r1,c1), (r2,c2),…}; …
where:
▪ n: grid size, 4 ≤ 𝑛 ≤ 200
▪ future_weight: how much future outcomes matter
▪ learning_rate: how quickly the robot adapts to new data
▪ obstacles: explicit set of obstacle coordinates
▪ knowledge: row-major order sequence of size n² as shown:
0, 1, 2, 3, …, 𝑛 − 1, 𝑛, …, 2𝑛 − 1, 2𝑛, …, 3𝑛 − 1, 3𝑛, …, 𝑛² − 𝑛, …, 𝑛² − 1 followed by ‘;’ (Note: the first row is 0, 1, 2, …, 𝑛 − 1, the second row is 𝑛, …, 2𝑛 − 1, the third row is 2𝑛, …, 3𝑛 − 1, and so on until the last row, which runs from 𝑛² − 𝑛 to 𝑛² − 1.)
▪ path: sequence of moves like 𝑆, 𝑆, 𝐸, 𝑊, 𝑁, 𝐸
Requirements:
1. The robot starts at (𝟎, 𝟎).
2. For each move:
o Use the provided future_weight and learning_rate when applying your update rule.
o Print the cell visited, updated knowledge value, and cumulative energy.
o Energy: +1 for a normal move, +3 if landing cell is an obstacle.
o If a move hits a wall, the robot stays in place, but energy is still consumed.
3. At the end, print the final knowledge matrix in grid form.
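The row-major layout and the move-by-move bookkeeping can be sketched as follows, using the grid size, obstacles, knowledge values, and path from the sample input in Section 3.4. The helper names `to_grid` and `replay` are hypothetical, the 1-unit cost for a move that hits a wall is an assumption, and the knowledge update itself is deliberately left out because it depends on your own update rule.

```python
MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

def to_grid(flat, n):
    """Reshape a row-major list of n*n values into n rows of n values.

    Flat index k corresponds to cell (k // n, k % n).
    """
    return [flat[i * n:(i + 1) * n] for i in range(n)]

def replay(path, n, obstacles):
    """Yield (step_number, move, cell, cumulative_energy) for each move from (0,0)."""
    r, c = 0, 0
    energy = 0
    for i, move in enumerate(path, start=1):
        dr, dc = MOVES[move]
        nr, nc = r + dr, c + dc
        if 0 <= nr < n and 0 <= nc < n:
            r, c = nr, nc
            energy += 3 if (r, c) in obstacles else 1
        else:
            energy += 1  # assumption: a wall hit costs 1 unit; the robot stays put
        yield i, move, (r, c), energy

grid = to_grid([5, 12, 3, 7, 8, 0, 10, 15, 6, 14, 1, 9, 11, 13, 2, 4], 4)
for step_no, move, cell, energy in replay(["S", "E", "E", "N"], 4, {(0, 1), (2, 2)}):
    # The knowledge update for `cell` would be applied here before printing.
    print(f"Step {step_no}: Move {move}, Cell {cell}, Cumulative Energy={energy}")
```

With these inputs the second row of `grid` is `[8, 0, 10, 15]`, and the four moves visit (1,0), (1,1), (1,2), and (0,2) with cumulative energy 1, 2, 3, and 4.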
3.4. Sample Input for Test Case 2
4 ; 0.9 ; 0.3 ; 0.2 ; obstacles={(0,1),(2,2)} ; knowledge=5,12,3,7,8,0,10,15,6,14, 1,9,11,13,2,4 ; path=S,E,E,N
Sample Output for Test Case 2
Step 1: Move S → Cell (1,0), Updated Value=8.5, Cumulative Energy=1
Step 2: Move E → Cell (1,1), Updated Value=0.5, Cumulative Energy=2
Step 3: Move E → Cell (1,2), Updated Value=10.5, Cumulative Energy=3
Step 4: Move N → Cell (0,2), Updated Value=3.5, Cumulative Energy=4
Final Knowledge Matrix:
[ 5, 12, 3.5, 7 ]
[8.5, 0.5, 10.5, 15 ]
[ 6, 14, 1, 9 ]
[ 11, 13, 2, 4 ]
3.5. How grading will run:
• Test Case 1 will be executed 3 times.
• Test Case 2 will be executed 10 times.
• All inputs will be dynamically generated within the constraints specified in the respective test case.
• Each student will receive different random inputs.
4. What’s Provided vs. What You Implement
Provided in the starter: seeding (`seed_from_student_id`), BFS reachability, grid environment (`MiniMineEnv`), and printing helpers. You implement only the TODOs (Q-table init, ε-greedy, Q-update, value grid, trainers/solvers, and parsers). Keep names/signatures unchanged.
5. What To Submit
You must submit the following:
1. `submission.py` — export your notebook (.ipynb → .py). Do not rename functions; keep print formats exact.
2. AI-disclosure PDF — file name exactly `#########_prompts.pdf` (your 9-digit KFUPM ID, no 's'). Must include all prompts and outputs from generative AI. Must be text-based (not images).
6. Grading
The homework is worth 100 points.
• Test Case 1 (45 points total):
o Your program will be run 3 times with different random inputs.
o Each correct run is worth 15 points.
o You must produce correct knowledge matrices, best path, and energy consumption for each run.
• Test Case 2 (55 points total):
o Your program will be run 10 times with different random inputs.
o Each correct run is worth 5.5 points.
o You must produce correct per-step updates and final matrix for each run.
• Missing/unreadable `#########_prompts.pdf`: 60% penalty (graded out of 40 instead of 100).
o Wrong filename error: −10 points.
• Write your own code; do not share solutions.
7. Autograder Notes
The autograder setup will be as follows:
• Runtime limit 10 minutes; memory 1 GB.
• Multiple executions with different random inputs within constraints.
• Seeding: set your `STUDENT_ID` and use `seed_from_student_id(STUDENT_ID)` as in the starter.
8. Academic Integrity & AI Policy
Students are cautioned that a code-similarity scan comparing all submissions will be conducted. In addition, the following generative-AI policy applies:
a. As stated in the syllabus:
• Submitting AI-generated solutions directly is considered cheating.
• It is prohibited to give the entire homework question to a generative AI and ask for a full solution.
• You can use generative AI with disclosure.
b. Permitted use:
• You may use AI tools to break down the problem, brainstorm ideas, debug, or generate small code snippets, as long as you clearly understand and adapt the result.
• For every use, you must include the prompt, the AI’s output, and a short reflection in a file named prompts and results.pdf (must be text-based, not a scanned image).
c. Penalty: Not submitting the disclosure file or submitting unreadable material will result in a 60% penalty (graded out of 40 instead of 100).
Write your own code; do not share solutions.
9. Submission Checklist
• `STUDENT_ID` set in the starter
• All TODO functions implemented; names/signatures unchanged
• Output formatting matches spec (labels, spacing, rounding)
• `submission.py` runs without edits
• `#########_prompts.pdf` included, text-based, correct filename