Assignment 2: Pacman – Capture the Flag

In this assignment we will develop an AI controller for a multiplayer variant of Pacman. In this game two teams, red and blue, compete with each other to capture as many dots as possible from the opposite side of the map. Originally developed at UC Berkeley, this setup presents us with a challenging planning domain and an opportunity to solve a fun problem. At the end of the semester we will hold a contest to find out who developed the most effective controller. Glory and a fancy certificate await the top three students!

Part 1: Installation

Piglet PDDL Solver

In this assignment you will use the piglet library to solve PDDL problems. This requires installing piglet into your Python environment:
1. Go to the piglet-public folder. (If you use a virtual environment, activate it first. You could use your flatland virtual environment as well.)
2. git fetch
3. git checkout pddl_solver (if successful, you should find pddl_solver.py under the lib_piglet/utils/ folder)
4. python setup.py install

Check the myTeam.py starter implementation to see how to use the newly installed library. The competition server will have the same piglet installed in its environment as a PDDL solver.

FIT5222 – Planning and Automated Reasoning

Update Pacman Code

Make sure you have the latest version of the Pacman CTF game environment:
1. Go to the pacman-public folder (you cloned the repo in week 1).
2. git fetch
3. git reset --hard (in case you made any changes locally)
4. git pull
5. If your code is up to date, you should see staffTeam.py and berkeleyTeam.py in the repo.

All code for the game environment is now found in the pacman folder.

Part 2: Game Rules

We can characterize Pacman CTF as follows:
● Multi-agent: Two agents need to work together against an opposing team.
● Discrete: The environment is a grid maze and time is discretised into unit time steps.
● Dynamic: The environment changes as food is consumed.
● Partially observable: Each agent has a limited sensing range.
● Sequential: Past decisions affect future actions.
● Deterministic: The next state depends only on the current state and the agents' actions.
● Offline: The world stops while an agent deliberates.
● Known: All the rules of the game are available a priori.

Environment

The game map is divided into two halves: red (left) and blue (right). When on the red side, a red agent is a ghost. When crossing into enemy territory, the agent becomes a Pacman. Red agents must defend the red food while trying to eat the blue food.

Scoring

When a Pacman agent eats a food dot, that dot is removed from the board and placed in a virtual backpack. When the agent returns to their own side of the board, they automatically "deposit" the contents of their backpack, earning one point per dot. The Red team scores positive points, while the Blue team scores negative points.

Watch out! If an agent is caught by a ghost before reaching their own side of the board, all dots in their backpack are returned to their original positions on the board. The agent also returns (as a ghost) to its original starting position. No points are scored in this case. No points are awarded to the opposing team either (for catching the Pacman).

Power Capsules

A small number of power capsules are scattered throughout the maze. These can be eaten by a Pacman agent. When this happens, ghosts become "scared" and are susceptible to being caught by the powered-up Pacman.
This scared effect lasts for the next 40 timesteps, or until the ghost is caught, whichever comes sooner.

Observations

Agents can only observe an opponent's configuration (position and direction) if they or their teammate are within 5 squares (Manhattan distance). In addition, an agent always gets a noisy distance reading for each agent on the board, which can be used to reason about unobserved opponents. The reading has an error of +/- 6 steps from the true distance. A general direction (toward the opponent) is not specified.

Winning

A game ends when one team returns all but two of the opponents' dots. Games are also limited to 300 time steps (ie, 300 moves for each of the four agents). This equates to 1200 total actions, as seen in the visualiser. If this move limit is reached, whichever team has returned the most food wins. If the score is zero (ie, tied) this is recorded as a tie game.

Part 3: Task & Student Workflow

At every time step the simulator will present you with the current state of the game board. Your task is to reason over this current state and decide how your agents should react.

This will require you to think about the planning process in several different ways: at the high level, where you decide what strategy your team will pursue; at the low level, where you transform that strategy into a concrete sequence of low-level actions; and you will need to decide when to replan your agents, in response to changing circumstances.

We illustrate an example of the student workflow in the diagram below.

Your starting point for modifying this workflow is the function chooseAction in the file myTeam.py. This function is called by the simulator at every time step and for each agent. The simulator provides the agent with a description of the game board. At the end of the function you must return a concrete action for the agent to execute in the next timestep.

How to decide which action the agent should take is entirely up to you.
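To make the cyclic workflow concrete, here is a minimal sketch of the decision loop behind a chooseAction-style function: pick a goal when there is no plan, expand the next high-level action into moves, and return the first move. All names here (SketchAgent, the placeholder goal and plan strings, the empty-dict game state) are illustrative stand-ins, not the real Pacman CTF or myTeam.py API:

```python
# Hedged sketch of the high-level/low-level decision cycle. The class name,
# action strings, and game-state representation are invented for illustration.

class SketchAgent:
    def __init__(self):
        self.high_level_plan = []   # queue of pending high-level actions
        self.low_level_plan = []    # queue of pending concrete moves

    def choose_action(self, game_state):
        # 1. If no usable high-level plan remains, pick a goal and replan.
        if not self.high_level_plan:
            goal = self.choose_goal(game_state)
            self.high_level_plan = self.plan_high_level(game_state, goal)
        # 2. If no usable low-level plan remains, expand the next high-level action.
        if not self.low_level_plan:
            action = self.high_level_plan.pop(0)
            self.low_level_plan = self.plan_low_level(game_state, action)
        # 3. Hand the next concrete move back to the simulator.
        return self.low_level_plan.pop(0)

    # Placeholder strategies; in a real agent these would call a PDDL solver
    # and a low-level path planner.
    def choose_goal(self, game_state):
        return "score-points"

    def plan_high_level(self, game_state, goal):
        return ["go-to-enemy-side", "eat-food", "return-home"]

    def plan_low_level(self, game_state, action):
        return ["North", "East"]
```

Each call to choose_action consumes one move; replanning only happens when a queue runs dry, which is exactly the decision you will need to revisit in the sections below.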
The starting code provides a basic template for the decision-making process and some simple, though not very effective, strategies. You will need to modify this code to create a winning AI.

The next sections give further details about the decision-making process. For more details regarding the Pacman CTF code, the myTeam.py implementation, and other general tips, refer to the additional document "Assignment 2 Code Documentation.pdf".

3.1. Choose a High Level Goal

What is the high-level strategy that agents should pursue? For example, they could visit the opposing side and try to score points, or they could stay home and defend their score. Each agent chooses its own high-level goal, but that decision may be influenced by the current state of the board, the current score, and the strategy and position of the teammate agent.

Your starting point here is the getGoals function in the file myTeam.py. Given a PDDL description of the current state, this function returns positive and negative PDDL state descriptions that together describe a high-level goal. You can create multiple different goals and then choose between them based on the information in the current state.

3.2. Generate a High Level Plan

To achieve your high-level goals you will need to compute a corresponding sequence of high-level actions (ie, a plan). Given a PDDL description of the initial (ie, the current) state and the goal states, the function getHighLevelPlan computes such a plan. You do not need to modify this function. Just call it as necessary.

But what if the agent already has a high-level plan, computed in a previous iteration? It's up to you to decide how to proceed. The agent could retain the existing plan, or it could change course and replan anew from the current state. For example, if the pre-conditions for the next action are no longer satisfied, the agent will certainly need to generate a new high-level plan.
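One simple way to make this precondition check concrete is to test whether every precondition of the next high-level action holds in the current set of PDDL facts. The predicate strings and the eat-food action below are invented for illustration; they are not taken from myTeam.pddl:

```python
# Hedged sketch of a plan-validity check. Facts and preconditions are modelled
# as sets of predicate strings; the names are hypothetical.

def plan_still_valid(current_facts, plan, preconditions):
    """Return True if the next high-level action's preconditions all hold."""
    if not plan:
        return False                      # no plan left: must replan anyway
    next_action = plan[0]
    # Subset test: every required predicate must appear in the current state.
    return preconditions.get(next_action, set()) <= current_facts
```

For example, if a hypothetical eat-food action requires the agent to be on the enemy side and the agent has just been caught and sent home, the subset test fails and a fresh high-level plan should be computed.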
The agent may also decide to revise its current plan (and possibly its current goal!) if its execution produces an unexpected result (eg, being caught by a ghost or a powered-up Pacman).

The staff implementation checks whether it can reuse the existing plan (see the starter code).

Note: Only a limited set of predicates are included in the basic PDDL state description (see the file myTeam.pddl) and not all of them appear in the given actions. You can create a richer model, and compute more diverse plans, by (1) refining existing actions or creating new ones that use more of the available predicates; (2) enhancing the PDDL state description with additional information from the current game state (ie, adding more predicates). We provide useful programmatic facilities for this purpose. See the get_pddl_state function for examples of how to collect information from the simulator environment (gameState).

3.3. Generate a Low Level Plan

High-level plans tell the agents what to do, but in broad strokes. For example, if the goal is to score points, a necessary action is to move to the opposite side of the board. But how should the agent get there? Once on the opposing side, the next high-level action might be to eat a food dot. But which one? Do I want to move left, right, up, or down? Low-level planning provides answers to these questions.

The function getLowLevelPlan is your starting point here. Its inputs are a high-level action and a description of the current game state. The function returns a sequence of move actions that the agent needs to take to achieve the high-level effects of the input action.

You are free to implement this function using any method you choose. For example:
● Heuristic Search Strategy: you can develop your own custom strategies by implementing a new expander class for your own customised state-space representation.
● Learning Based Strategy: we provide a very basic example of this approach.
You can make your own implementation or improve the provided one by introducing more features to the model. You will then train the new model to obtain a set of good weights. You should carefully consider, for each high-level action, how you will reward or penalise the agent for its performance. You may need to keep adjusting your implementation, observing how your agent behaves, and trying something new to overcome the drawbacks you observed.

But what if the agent already has a low-level plan, computed during a previous iteration? Again, it's up to you to decide how to proceed. The agent could continue following the existing low-level plan or it might decide to revisit its decisions on account of new information provided by the game state. For example, if Pacman is heading towards a food dot but observes a ghost up ahead, continuing toward the dot may be a bad idea.

The staff implementation reuses low-level plans in a similar way (see the starter code).

Remember! There are many different ways to achieve a high-level action. You may also wish to consider whether your two agents coordinate their low-level actions or act independently of one another.

3.4. Execute Moves

Your low-level strategy has generated a sequence of directions for the agent to move in. The first move in this plan is given to the simulator for execution. Each of your two agents, and the two opposing agents, move according to their own instructions and the timestep is advanced by one. All of this happens after the completion of the chooseAction function.

Based on the new state of the game board, you may need to re-evaluate your goals and plans. This workflow is cyclic and continues until the game is over.

Remember: You can (and should!) compare your implementation (myTeam.py) against the existing baselines (staffTeam.py or berkeleyTeam.py). You can find these implementations in the pacman folder.
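As a minimal example of the heuristic-search style of low-level planning discussed above, a breadth-first search over the maze grid can turn a target square into a sequence of moves. The grid encoding, wall set, and move names below are assumptions for the sketch; a real implementation would plan over the game's own maze layout and might use A* with a custom heuristic instead:

```python
# Hedged sketch of a low-level planner: BFS from start to goal on a grid,
# returning a shortest list of move names. Coordinates and move names are
# illustrative, not the game's actual representation.
from collections import deque

MOVES = {"North": (0, 1), "South": (0, -1), "East": (1, 0), "West": (-1, 0)}

def bfs_path(walls, start, goal):
    """Return a shortest list of moves from start to goal, avoiding walls."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if (x, y) == goal:
            return path
        for name, (dx, dy) in MOVES.items():
            nxt = (x + dx, y + dy)
            if nxt not in walls and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, path + [name]))
    return []  # goal unreachable
```

Swapping the queue for a priority queue ordered by path cost plus a heuristic (eg, maze distance, or a penalty near observed ghosts) turns this into the kind of customised search the heuristic-search strategy invites.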
The baselines are useful to check that your implementation works and to measure your progress.

Part 4: Competition Setup

We will have a competition to see who has the strongest Pacman AI. You will be able to submit your implementation to a contest server, where entries will be run against each other.

Upon submission to the server your agent will be evaluated against a staff baseline implementation (staffTeam.py). Your agent will be evaluated for 49 games on 7 different maps. Within each game there will be a time limit, and the agent that has succeeded in eating the most food during that time will win that game. To win the match with the staff baseline, your team must win convincingly: 28 out of 49 games.

By beating staffTeam.py, you will get a pass for the Criteria 1 competition score (see more details in the marking rubric), earn a position on the leaderboard, and your agent will be evaluated against all other participants on the leaderboard to earn additional victory points.

Against each opponent on the leaderboard (excluding the staff baseline implementation), your agent will be evaluated on the same set of problems:
● In the event of a convincing win (winning at least 28 out of 49 games), you will get 3 victory points.
● In the event of a tie (winning fewer than 28 games, but also losing fewer than 28 games, with tied games counting as neither a win nor a loss), you will get 0.5 victory points.
● In the event of a loss (losing 28 games or more), you will get 0 victory points.

If you beat the staff baseline implementation, your mark for the Criteria 1 competition will be decided by (your total victory points / all available victory points)%, where the available victory points are 3 × (number of all participants – 1).

You are expected to clearly win against the staff baseline implementation. Otherwise you will not proceed to the next stage of competing with entries from other students.
In this case your grade for Criteria 1 of the marking rubric will be a Fail.

The server will keep a record of your matches against every opponent on the leaderboard. When someone achieves a new personal-best record, your score against that student will be updated as well.

More details regarding the competition setup will be provided in a supplementary document, "Assignment 2 Contest Submission Instruction.pdf".

Part 5: Report

Together with your implementation source code you will be required to submit a report that describes your approach. You will need to write a description of your strategy and give a justification for your algorithmic choices. Make sure that your report is as detailed and complete as possible, including appropriate references.

There is a 15-page limit on your report (not including references), with a minimum font size of 12 pt (Times New Roman or Arial). Markers will stop reading after the page limit, in which case your report may be marked as incomplete.

When describing your model you should explain not only what it does but also analyse its strengths and weaknesses (eg, complexity, guarantees, etc). You should also discuss how your code implements that model: for example, important data structures, necessary optimisations, and parameter choices. Justify your design and implementation choices with discussion and, where appropriate, with experiments.

You can ask questions about the assignment and you can discuss the merits of different design choices (e.g., on Ed). However, all submitted work must be completely your own. Do not copy implementations you happen to find online. Do not ask your friends or colleagues for their implementations. You will need to work independently and take reasonable steps to make sure others don't copy or misuse your work.
For more information please see https://www.monash.edu/students/study-support/academic-integrity

PLAGIARISM IS NOT ACCEPTABLE. YOU WILL FAIL THE ASSIGNMENT AND POSSIBLY THE ENTIRE UNIT.

Part 6: Marking Rubric

The rubric specifies the marking criteria for your implementation and report. To avoid disappointment, pay close attention to the requirements. The weight of the implementation is 66.6% while the weight of the report is 33.3%.

Marking Rubric – Implementation (100/150 marks)

1. Competition (50 marks)
● N (0%–49%): Loses to the staff baseline implementation.
● P (50%–59%): Outranks the staff baseline implementation.
● C (60%–69%): Gets 25%–49% of all available victory points.
● D (70%–79%): Gets 50%–74% of all available victory points.
● HD (80%–100%): Gets 75%–100% of all available victory points.

2. Agent strategy implementation (50 marks)
● N (0%–49%): Minor updates to the example PDDL implementation in the code base (for example, only changing parameter values).
● P (50%–59%): Addresses some drawbacks in the existing implementation.
● C (60%–69%): Improves the model and high-level planner by (1) introducing new actions to the PDDL model, which use more of the listed predicates or new predicates you introduced, or (2) designing new goal functions.
● D (70%–79%): Same as the previous criteria, plus: implements an alternative high-level goal prioritisation scheme (agents switch goals based on new observations); improves the low-level planner, e.g., more features, better reward functions, or heuristic search for low-level planning.
● HD (80%–100%): Same as the previous criteria, plus: customised low-level plans for distinct high-level actions, e.g. different low-level strategies for eating food (which dot?) vs. going to the opposing side (where?); high-level and low-level decisions take into account the current action and strategy of the teammate agent (the two agents should cooperate and share information with each other). Or design your own model and agent workflow.

Marking Rubric – Report (50/150 marks)

Approach Description (15 marks)
● N (0%–49%): Incomplete or insufficient description of the approach. Does not demonstrate sufficient understanding of the implemented approach and related algorithms. Little or no pseudo-code.
● P (50%–59%): There is a description of the approach, but it is generally unclear. Little or no evidence to support the discussion. The discussion demonstrates a limited understanding of the implementation approach and related algorithms. (The report should describe and discuss how the new implementation improves upon the staff baseline, staffTeam.py.)
● C (60%–69%): There is a description of the approach that can be mostly understood, except in respect of small details. Provides some motivation for the implementation choices and links the agent strategy to lectures and tutorials. There is limited evidence and a limited argument justifying the choice of approach. The discussion demonstrates a baseline understanding of the implementation approach and related algorithms.
● D (70%–79%): There is a description of the customised low-level plans and actions that can be fully understood. There is a clear but somewhat flawed argument justifying the methodology. The approach is supported by evidence that is somewhat relevant. The discussion demonstrates good understanding of the implemented approach and related algorithms, but with minor issues.
● HD (80%–100%): There is a description of the approach that can be fully understood. The justification of the methodology is supported by strong evidence. The discussion demonstrates excellent understanding of the implementation approach and related algorithms. Provides clear theoretical motivation for the implementation choices and directly links the agent strategy to lectures and tutorials.

Experiment Discussion (15 marks)
● N (0%–49%): Provides little or no discussion of experimental work. Results are missing, incomplete, or unsupported by evidence. No numerical details are presented, and efficiency or implementation performance is not addressed. Work shows major flaws or misunderstanding of the experimental process.
● P (50%–59%): Provides a basic discussion of experimental results with limited evidence. Some numerical details may be included but are incomplete or lack context. Mentions efficiency of the implementation in vague terms, without meaningful analysis or comparison. Discussion shows only minimal engagement with the experiments performed.
● C (60%–69%): Presents a sound discussion of experiments with supporting numerical details. Results are generally clear and partially analysed for efficiency. Some attempt to evaluate implementations against benchmarks or references is made, but analysis may be superficial or inconsistent. Shows adequate understanding of the experimental process.
● D (70%–79%): Provides a strong and well-supported discussion of experiments, comparing at least 2 implemented strategies/attempts. Numerical evidence is clearly presented and used to evaluate the efficiency of implementations. Comparative analysis against standard benchmarks and/or appropriate reference algorithms is evident and accurate. Discussion demonstrates critical engagement, though some aspects may lack full depth.
● HD (80%–100%): Demonstrates a comprehensive and insightful discussion of experiments. Numerical results are detailed, accurate, and fully integrated into the analysis, comparing at least 3 implemented strategies/attempts. Efficiency of implementations is rigorously assessed on standard benchmarks and compared effectively to appropriate reference algorithms. Evidence is consistently used to support claims, and the discussion reflects exceptional depth, precision, and critical evaluation.

Reflections (10 marks)
● N (0%–49%): Lacks critical thinking and original insights. Discussion of the strengths and weaknesses of the implemented approach is absent or irrelevant.
● P (50%–59%): Limited critical thinking and few original insights. Conclusions are vague or poorly articulated, but touch upon some strengths and weaknesses of the implemented approach.
● C (60%–69%): Demonstrates basic critical thinking with limited insights. Provides some conclusions, but misses key implications relevant to the implementation approach.
● D (70%–79%): Shows good critical thinking with some reflections on the strengths and weaknesses of the implementation approaches. Provides clear discussion of implications, but misses some key points.
● HD (80%–100%): Demonstrates strong critical thinking, providing insightful reflections on the strengths and weaknesses of the implementation approaches. Clearly articulates the implications of the findings.

Communication skills (10 marks)
● N (0%–49%): Hard to follow, with no clear narrative. Inadequate or no separation of the discussion text into coherent sections. Writing is not accurate or articulate. Inadequate supporting materials. Inadequate or missing referencing.
● P (50%–59%): The writing has a tenuously logical narrative, with some attempt at the expected structural elements (e.g., intro, conclusion). The writing is inaccurate or inarticulate much of the time. The document has few supporting materials (tables, images, pseudo-code). The student has attempted citing and referencing, with frequent errors.
● C (60%–69%): The text has a clear logical narrative and the expected structural elements (eg, intro, conclusion). The writing is accurate and articulate most of the time. There are some supporting materials (tables, images, pseudo-code), but they are not well integrated with the rest of the text. The student follows the requirements for citing and referencing, with some notable errors.
● D (70%–79%): The writing is well composed, with a clear and logical narrative, and is well structured. Writing is generally accurate and articulate. The document has appropriate supporting materials that are well integrated with the rest of the text. The student follows the requirements for citing and referencing, with some minor errors.
● HD (80%–100%): The writing is very well composed and has a very clear and logically-formed narrative as a whole. Writing is accurate and articulate. The document is expertly structured in the style of a scientific report, including appropriate supporting materials that clearly improve the quality of the associated discussion. The student follows the requirements for citing and referencing.

Part 7: Submission Guide

Submission to Moodle:

The following submission items are required:
1. Your implementation source code, in a single directory called "src" (you can copy everything in the pacman folder to "src"). Zip the code directory with the file name last_name_student_id_pacman.zip (for example, Chen_123456_pacman.zip).
2. The report describing your approach, as a single PDF file. Name the PDF last_name_student_id_report_pacman.pdf (for example, Chen_123456_report_pacman.pdf).

Submission to Contest Server: