Input lag in Psikyo games in MAME


There is a long-standing belief that MAME has high input lag for Psikyo games (especially the earlier titles), so I decided to have a look.

Some people use an ancient MAME fork called “Shmupmame”, since it has less input lag for these games. I assumed this was due to bugs in the MAME driver, so I started probing some PCBs of mine to figure out how they work.

The MAME drivers report 4 frames of lag on the earlier titles (which is also what I saw when emulating them), and while I did find some things worth adjusting in the MAME driver, these things don’t have any impact on latency. I did however find something else interesting…

Early Psikyo PCBs have quite a bit of lag!

So yeah, the TL;DR is that MAME is accurate, because the early Psikyo PCBs have the same amount of input lag. The relevant games are:

  • Sengoku Ace / Samurai Aces
  • Gunbird
  • Battle K-Road
  • Strikers 1945
  • Tengai / Sengoku Blade

“Frames of input lag” is an ambiguous term though, so to be clear:

  • Button/lever is pressed during frame 0
  • No sprite movement on frame 1
  • No sprite movement on frame 2
  • No sprite movement on frame 3
  • Sprite movement on frame 4!

Also note that when playing on original hardware, the input lag will vary by up to one additional frame depending on the raster beam position of the screen at the time of the button press, since inputs are sampled once per frame. Additionally, sprites further up on the screen will be updated sooner, since the beam scans from top to bottom.
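To put rough numbers on that variation, here is a minimal sketch of the timing model described above. It assumes the input is sampled exactly once per frame at a frame boundary, uses the 59.923Hz refresh rate measured for these boards, and ignores where on the screen the sprite is drawn. The function name and the model itself are illustrative, not taken from the game code or the MAME driver.

```python
# Simplified model: a press waits for the next once-per-frame input sample,
# then the early Psikyo games show three frames without sprite movement.
FRAME_HZ = 59.923                 # measured VSync rate for these boards
FRAME_MS = 1000.0 / FRAME_HZ      # duration of one frame in milliseconds
FRAMES_WITHOUT_MOVEMENT = 3       # frames 1-3 in the list above


def press_to_movement_ms(ms_since_last_sample):
    """Time from button press to the start of the first frame with movement.

    ms_since_last_sample: how long after the previous input sample the
    button was pressed, in the range [0, FRAME_MS).
    """
    if not 0.0 <= ms_since_last_sample < FRAME_MS:
        raise ValueError("press offset must lie within one frame")
    wait_for_sample_ms = FRAME_MS - ms_since_last_sample
    return wait_for_sample_ms + FRAMES_WITHOUT_MOVEMENT * FRAME_MS


# Pressed right after the sample (slowest) vs. right before it (fastest).
slowest_ms = press_to_movement_ms(0.0)
fastest_ms = press_to_movement_ms(FRAME_MS - 0.001)
```

Under these assumptions the result ranges from just over three frames to four frames, which is the "up to one additional frame" spread mentioned above.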

Here are two typical examples of what it can look like on Tengai, shot at 240fps.

Note that the time from button press until the bomb animation starts is the same as from lever input to player movement in this game (in some other games that is not true).

“Fast case” for Tengai

See images at:

  • Img 5: Button is pressed (see LED), right before it gets sampled.
  • Img 6-9: One frame of no sprite movement
  • Img 10-13: One frame of no sprite movement
  • Img 14-17: One frame of no sprite movement
  • Img 18: Bomb animation starts

“Slower case” for Tengai

See images at:

  • Img 5: Button is pressed right AFTER it gets sampled.
  • Img 6-8: Game still doesn’t know button was pressed
  • Img 9: Finally button is sampled
  • Img 10-13: One frame of no sprite movement
  • Img 14-17: One frame of no sprite movement
  • Img 18-21: One frame of no sprite movement
  • Img 22: Bomb animation starts
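The image numbers in the two cases above can be converted to approximate times. This is only a sketch: it assumes the camera really captures 240 evenly spaced images per second, and the result is only as precise as one image (about 4ms). The helper name is made up for illustration.

```python
CAMERA_FPS = 240  # capture rate of the footage


def images_to_ms(first_img, last_img, camera_fps=CAMERA_FPS):
    """Approximate time elapsed between two image numbers, in milliseconds."""
    return (last_img - first_img) * 1000.0 / camera_fps


# Img 5 (button pressed) to the image where the bomb animation starts.
fast_case_ms = images_to_ms(5, 18)   # "fast case": 13 images
slow_case_ms = images_to_ms(5, 22)   # "slower case": 17 images
```

The two cases differ by four images, which is roughly one game frame at this capture rate.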

Tengai in MAME

In MAME, the easiest way to verify that the timing is similar to the PCB is to do the following:

  • Bind buttons for “Pause” and “Pause – Single Step” (UI input menu)
  • Pause game while not inputting anything
  • Hold the bomb button (or a direction) while paused, and keep holding it while doing the steps below
  • Pause Single Step (One frame of no sprite movement)
  • Pause Single Step (One frame of no sprite movement)
  • Pause Single Step (One frame of no sprite movement)
  • Pause Single Step (Animation starts)

Basically… everything seems fine.

What about later games?

Later games have much lower input latency on PCB. These include:

  • Sol Divide
  • Strikers 1945 II
  • Strikers 1945 III
  • Gunbird 2
  • Dragon Blaze

… but they do on MAME too.

When testing Strikers 1945 II on PCB, I get one frame of no sprite movement. MAME produces the same result as PCB if using the Pause+Step method described above.

Basically, the newer games have two frames less input lag on both PCB and MAME compared to the older games.

But shmupmame has less lag on the early games right?

Yes, shmupmame has less lag than MAME and original PCBs for early Psikyo games. It does this by taking shortcuts: some buffers can be safely skipped in emulation if you’re not worried about emulating the hardware accurately. If you prefer to play these games like that, then that’s fine too. It’s just not how the games worked originally 🙂

Other stuff I fixed in the MAME Psikyo driver

Still made some solid improvements though.

  • Measured accurate HSync and VSync timings. Previously they were not correct.
  • Fixed various issues in the documentation
  • Corrected vertical blanking interrupt level (was irq1, should be irq4)
  • Removed MACHINE_IMPERFECT_TIMING from all relevant machines, since they’re now verified to work correctly

For reference, correct timings are:

SYNCS:  HSync 15.700kHz, VSync 59.923Hz
   HSync most likely derived from 14.3181MHz OSC (divided by 912)
   262 lines per frame consisting of:
   - Visible lines: 224
   - VBlank lines: 38 (Front porch: 15 lines, Back porch: 15 lines, VSync: 8 lines)
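
As a quick sanity check, the numbers above can be reproduced with a few lines of arithmetic. This sketch assumes the oscillator value as written (14.3181MHz), 912 oscillator clocks per scanline, and that the 15-line porch figure applies to the front and back porch each, which is what makes the VBlank total come out to 38. The variable names are just for illustration.

```python
OSC_HZ = 14_318_100        # 14.3181 MHz oscillator, as quoted above
CLOCKS_PER_LINE = 912      # assumed divider from OSC to HSync
VISIBLE_LINES = 224
FRONT_PORCH_LINES = 15     # assumption: 15 lines each for front and back porch
BACK_PORCH_LINES = 15
VSYNC_LINES = 8

vblank_lines = FRONT_PORCH_LINES + BACK_PORCH_LINES + VSYNC_LINES
total_lines = VISIBLE_LINES + vblank_lines

hsync_hz = OSC_HZ / CLOCKS_PER_LINE   # roughly 15.700 kHz
vsync_hz = hsync_hz / total_lines     # close to the measured 59.923 Hz

print(f"lines per frame: {total_lines}")
print(f"HSync: {hsync_hz / 1000:.3f} kHz, VSync: {vsync_hz:.3f} Hz")
```

The computed VSync lands within about 0.001Hz of the measured figure; the small difference comes from the rounded oscillator value.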


MAME is fine and accurate for Psikyo games in terms of input latency; you don’t need old forks.


Research into CV1000 Blitter performance and behavior

I’ve spent some time in December looking into CV1000 Blitter behavior to figure out how it performs in terms of slowdown. I feel I have a good understanding of how it works now, and have put together a doc describing it.

View/Download it here: CV1000_Blitter_Research_by_buffi.pdf

Why do this?

The current simulation of this Blitter in MAME is quite impressive as a high-level reproduction, but there doesn’t seem to have been much time spent researching the timing of operations.

This document aims to describe how the Blitter actually behaves and how long its operations take, so that people can use it to build something that’s mostly accurate.

Also, it is very fun to attach a Logic Analyzer to a PCB and figure out how it works.

Preemptively Answered Questions

Q: But what about tuning Blitter Delay in MAME?
A: Trying to tune the existing Blitter Delay slider in MAME doesn’t really make any sense, since the slowdown introduced from it doesn’t have anything to do with how it works on real hardware. It’s still arguably better than no slowdown at all, which used to be the other option, but that’s about it.

Q: Will this make CV1000 emulation run with proper slowdown?
A: Probably not really. While this should make it possible to make the Blitter part of the emulation more accurate, there’s still no emulation of SH-3 Wait States, which means that slowdown caused by the CPU stalling on waits and not finishing its processing before VBLANK will still not be accurate. I have no idea how much this matters for most games.

Q: How much work is it to implement this?
A: It should be very simple. And the simplest thing to do would be:

  • Rip out all the existing Blitter delay logic.
  • When sending a Command to start Blitter Operations, estimate the time they will take to compute.
  • Don’t return “Ready” for the Ready Requests until that time has passed.

This still doesn’t reflect how it performs on real hardware (where Operations run concurrently with the CPU, and new Operations are requested when the existing ones are done executing), but in practice I don’t think that should really matter in terms of experienced gameplay performance.
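
The three steps listed above can be sketched roughly as follows. This is Python rather than MAME’s C++, purely to show the shape of the idea. The class, the `BlitOp` fields and especially the cost constants are hypothetical placeholders; a real estimate would have to come from the measurements in the research document.

```python
from dataclasses import dataclass


@dataclass
class BlitOp:
    """One queued Blitter Operation (hypothetical, minimal fields)."""
    width: int
    height: int


# Placeholder cost model. These numbers are NOT measured values.
HYPOTHETICAL_NS_PER_OP = 100.0
HYPOTHETICAL_NS_PER_PIXEL = 10.0


def estimate_ns(ops):
    """Estimate how long a list of Operations takes to execute."""
    return sum(
        HYPOTHETICAL_NS_PER_OP + op.width * op.height * HYPOTHETICAL_NS_PER_PIXEL
        for op in ops
    )


class BlitterTimingStub:
    """Reports 'not Ready' until the estimated execution time has passed."""

    def __init__(self):
        self.busy_until_ns = 0.0

    def start(self, ops, now_ns):
        # Called when the CPU sends the Command to start Blitter Operations.
        self.busy_until_ns = now_ns + estimate_ns(ops)

    def is_ready(self, now_ns):
        # Called when the CPU sends a Ready Request.
        return now_ns >= self.busy_until_ns


blitter = BlitterTimingStub()
ops = [BlitOp(width=32, height=32)]
blitter.start(ops, now_ns=0.0)
```

With this shape, the only part that needs real hardware knowledge is `estimate_ns`; everything else is bookkeeping around a single "busy until" timestamp.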


If you have feedback on the document, or suggestions for further work, please reach out to me on the Arcade-Project forums, on GitHub, or in the comments on this blog.