6CCS3AIN
Coursework
1 Introduction
Read all these instructions before starting.
This exercise will be assessed.
2 Getting started
Version 6 of api.py further extends what Pacman can know about the world. In addition to knowing the location of all the objects in the world (walls, food, capsules, ghosts), Pacman can now see what state the ghosts are in, and so can decide whether they have to be avoided or not.
3 What you need to do
3.1 Write code
- A finite set of states S;
- A finite set of actions A;
- A state-transition function P(s' | s, a);
- A reward function R;
- A discount factor γ ∈ [0, 1];
Following this you can then compute the action to take, either via Value Iteration, Policy Iteration or Modified Policy Iteration. It is expected that you will correctly implement such a solver and optimize the choice of the parameters. There is a (rather familiar) skeleton piece of code to take as your starting point in the file mdpAgents.py. This code defines the class MDPAgent.
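As an illustration of the first of those options, here is a minimal Value Iteration sketch on a toy four-cell corridor. It is not the coursework solution and does not use api.py. Every name in it (value_iteration, greedy_policy, transition, the corridor itself) is hypothetical, and the rewards, the 0.8/0.2 transition split and the discount factor are placeholder values chosen only so that the example runs.

```python
def value_iteration(states, actions, transition, reward, terminals,
                    gamma=0.9, tol=1e-6, max_iters=1000):
    """Return a dict mapping each state to its estimated utility."""
    values = {s: 0.0 for s in states}
    for _ in range(max_iters):
        new_values = {}
        delta = 0.0
        for s in states:
            if s in terminals:
                # Absorbing state: its utility is just its reward.
                new_values[s] = reward(s)
            else:
                # Bellman update: R(s) + gamma * max_a sum_s' P(s'|s,a) U(s')
                best = max(
                    sum(p * values[s2] for s2, p in transition(s, a))
                    for a in actions
                )
                new_values[s] = reward(s) + gamma * best
            delta = max(delta, abs(new_values[s] - values[s]))
        values = new_values
        if delta < tol:
            # Largest change in this sweep is below the tolerance: stop.
            break
    return values


def greedy_policy(states, actions, transition, values, terminals):
    """Pick, for each non-terminal state, the action with the best expected utility."""
    policy = {}
    for s in states:
        if s in terminals:
            continue
        policy[s] = max(
            actions,
            key=lambda a: sum(p * values[s2] for s2, p in transition(s, a))
        )
    return policy


# Toy corridor (hypothetical): cell 0 holds a ghost, cell 3 holds food.
STATES = [0, 1, 2, 3]
ACTIONS = ['left', 'right']
TERMINALS = {0, 3}
REWARDS = {0: -10.0, 1: -1.0, 2: -1.0, 3: 10.0}


def transition(state, action):
    """Assumed model: the move succeeds with 0.8, otherwise Pacman stays put."""
    step = 1 if action == 'right' else -1
    target = min(max(state + step, 0), 3)
    return [(target, 0.8), (state, 0.2)]


values = value_iteration(STATES, ACTIONS, transition, REWARDS.get, TERMINALS)
policy = greedy_policy(STATES, ACTIONS, transition, values, TERMINALS)
```

In this toy setting the greedy policy moves right from both interior cells. The parameters worth tuning in a real agent are the same ones exposed here: the reward values, the discount factor, and the stopping tolerance or iteration cap.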
There are two main aims for your code:
To win games, Pacman has to be able to eat all the food. In this coursework, for these objectives, “winning” just means getting the environment to report a win. Score is irrelevant.
3.1.1 Getting Excellence points
A couple of things to be noted. Let W be the set of games won, i.e., |W| ∈ [0, 25]. For any won game i ∈ W define sw(i) to be the score obtained in game/run i.
Losses count as 0 score and are not considered. If ∆Se < 0, we set it to 0 (you cannot have a negative excellence score difference).
- Because smallGrid does not have room for score improvement, we will only look at the mediumClassic layout
- You can still get excellence points if your code performs poorly in the number of wins; marking points are assigned independently in the two sections
- Note, however, that marking points are assigned such that it does not pay to aim directly for a higher average winning score without first securing the previous section’s aims (a) and (b)
- We will use the same runs in mediumClassic to derive the marks for Table 2 and Table 3.
3.2 Things to bear in mind
(b) We will evaluate whether your code can win games in mediumClassic by running:
(c) The time limit for evaluation is 25 minutes for mediumClassic and 5 minutes for smallGrid. Evaluation will run on a high-performance computer with 26 cores and 192 GB of RAM. The time constraints were chosen after repeated practical experience and reflect a fair time bound.
(d) When using the -n option to run multiple games, the same agent (the same instance of MDPAgent) is run in all the games. That means you might need to reset the values of some of the state variables that control Pacman’s behaviour in between games. You can do that using the final() function.
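A minimal sketch of that reset pattern follows. The class is a standalone stand-in so that it runs without the Pacman code; it is not the real MDPAgent. The attribute names (map_built, utilities), the helper reset_game_state, and the assumption that final() receives the end-of-game state as its only argument are all hypothetical and should be checked against the skeleton in mdpAgents.py.

```python
class ResettableAgentSketch(object):
    """Stand-in for MDPAgent showing per-game state being cleared in final()."""

    def __init__(self):
        self.reset_game_state()

    def reset_game_state(self):
        # Per-game caches (hypothetical names): rebuilt at the start of each game.
        self.map_built = False
        self.utilities = {}

    def getAction(self, state):
        if not self.map_built:
            # First move of a game: build whatever the solver needs.
            self.utilities = {(1, 1): 0.0}
            self.map_built = True
        return 'Stop'  # placeholder action for the sketch

    def final(self, state):
        # Called once a game ends; clear everything tied to that game.
        self.reset_game_state()


agent = ResettableAgentSketch()
agent.getAction(None)                 # game 1 populates the caches
built_during_game = agent.map_built
agent.final(None)                     # end of game 1 clears them again
```

Putting the reset logic in one helper that both the constructor and final() call keeps the two from drifting apart when new state variables are added.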
3.3 Limitations
Code written in Python 3.X is unlikely to run with the clean copy of pacman-cw that we will test it against. If it doesn’t run, you will lose marks.
(b) Your code must only interact with the Pacman environment by making calls through functions in Version 6 of api.py. Code that finds other ways to access information about the environment will lose marks.
The idea here is to have everyone solve the same task, and have that task explore issues with non-deterministic actions.
(c) You are not allowed to modify any of the files in pacman-cw.zip except mdpAgents.py.
Similar to the previous point, the idea is that everyone solves the same problem — you can’t change the problem by modifying the base code that runs the Pacman environment. Therefore, you are not allowed to modify the api.py file.
(d) You are not allowed to copy, without credit, code that you might get from other students or find lying around on the Internet. We will be checking.
This is the usual plagiarism statement. When you submit work to be marked, you should only seek to get credit for work you have done yourself. When the work you are submitting is code, you can use code that other people wrote, but you have to say clearly that the other person wrote it — you do that by putting in a comment that says who wrote it. That way we can adjust your mark to take account of the work that you didn’t do.
This is to ensure that your MDP-solver is the thing that can win enough games to pass the functionality test.
4 What you have to hand in
Your submission should consist of a single ZIP file. (KEATS will be configured to only accept a single file.) This ZIP file must include a single Python .py file (your code).
To streamline the marking of the coursework, you must put all your code in one file, and this file must be called mdpAgents.py.
Submissions that do not follow these instructions will lose marks. That includes submissions which are RAR files. RAR is not ZIP.
5 How your work will be marked
We will test your code by running your .py file against a clean copy of pacman-cw. As discussed above, the number of games you win determines the number of marks you get. Since we will check it this way, you may want to reset any internal state in your agent using final() (see Section 3.2). For the excellence marks, we will look at the winning scores for the mediumClassic layout.
Since we have a lot of coursework to mark, we will limit how long your code has to demonstrate that it can win. We will terminate the run of the 25 smallGrid games after 5 minutes, and will terminate the run of the 25 mediumClassic games after 25 minutes. If your code has failed to win enough games within these times, we will mark it as if it lost. Note that we will use the -q option, which runs Pacman without the interface, to speed things up.
(b) Code not written in Python will not be marked.
(c) Code that does not run in our test setting will receive 0 marks, regardless of the reason.
(d) We will release the random seed that we use for marking. Say the seed is 42, then you need to do the following to verify our marking is correct:
1) Fix the random seed to 42 (an int, not the string ’42’) at line 541 of pacman.py.
2) Download a fresh copy of the new api (to avoid using files you modified yourself).
3) Run python pacman.py -q -f -n 25 -p MDPAgent -l mediumClassic
4) You should get the same result as us. If not, repeat step 3). Should the outcome still be different, then you didn’t fix the random seed correctly. Go back to 1).
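The int-versus-string distinction matters because the two seeds produce different random sequences. The sketch below shows this with Python's standard random module only. It does not touch pacman.py, and it assumes that pacman.py draws its randomness from that module. The helper name sample is hypothetical.

```python
import random


def sample(seed, n=5):
    """Draw n numbers from a private generator initialised with the given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


int_run_a = sample(42)
int_run_b = sample(42)
string_run = sample('42')

# The same int seed reproduces the same sequence; the string seed does not match it.
reproducible = (int_run_a == int_run_b)
string_differs = (int_run_a != string_run)
```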