cspang1 / jcap

JAMMA Custom Arcade Project

License: MIT License

Propeller Spin 99.07% Shell 0.93%

asm propeller-spin game-engine arcade-machine parallax-propeller parallax-p8x32a-microcontroller graphics-engine verilog-hdl game-development jamma

jcap's Issues

Refactoring of VGA engine set-up for VGA video driver

Configuration values are currently hard-coded into VGA driver; need to calculate these values in Hub's Spin and/or Cog's PASM code.

Hardcode graphic resources memory model

Currently, the memory model for graphics resources (e.g. tile map, tile palette map, etc.) is arbitrary and each set of elements is addressed individually. This is flexible but introduces unnecessary complexity when implementing inter-propeller communication of the graphics resources.

The solution is to allocate explicit regions of memory for these resources, so the entire memory model as a whole can be transmitted efficiently.

ADDT'L: Modify transfer routine to accept parameterized buffer size, as games will rarely need entire graphics memory.

Optimize render system tile parsing via counter B trick

As discussed here.

Restarting TX cog occasionally causes TX/RX interconnection to fail/hang

Usually, if the TX cog is restarted during transmission, the RX cog will re-acquire its data stream and continue operating as if nothing had happened.

Occasionally, however, the connection fails altogether and no number of TX cog restarts re-establishes the data stream; an RX cog restart followed by a TX cog restart is required.

Refactor multicog VGA driver to re-utilize initialization code as buffer space

Chip's high-res VGA driver recycles initialization code to allocate space for the scanline buffer. With a max horizontal tile resolution of 40 tiles, the combined pixel and color buffers for four lines will be 40 * 4 * 2 = 320 longs. With 496 longs per cog available for instructions and variables, this leaves 496 - 320 = 176 longs available for the VGA driver. Re-utilizing initialization space, as well as general refactoring of the driver, should be adequate to fit everything.
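The budget above can be re-checked with a short calculation. This is only a sketch of the arithmetic in the issue text; the constant and function names are made up for illustration, and it assumes one long per tile in each of the pixel and color buffers, as the figures above imply.

```python
# Cog RAM budget for the scanline buffers, using the figures quoted in the issue.
MAX_TILES_PER_LINE = 40   # max horizontal tile resolution
BUFFERED_LINES = 4        # scanlines held in the buffer at once
BUFFERS_PER_LINE = 2      # one pixel buffer plus one color buffer
COG_LONGS = 496           # longs available per cog for instructions and variables


def driver_budget(tiles=MAX_TILES_PER_LINE, lines=BUFFERED_LINES):
    """Return (longs used by buffers, longs left for driver code and variables)."""
    buffer_longs = tiles * lines * BUFFERS_PER_LINE
    return buffer_longs, COG_LONGS - buffer_longs


buffer_longs, code_longs = driver_budget()
print(buffer_longs, code_longs)
```

Passing a smaller `tiles` or `lines` value shows how much driver space a lower resolution or shallower buffer would free up.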

Horizontal mirroring broken at left side wrap

Screen-out at leftmost bound is reversed when mirrored

Improve 1 microsecond wait subroutine accuracy

Currently the wait subroutine exhibits ~0.1 µs (10%) error.

TX/RX Routine failing after first full buffer transfer

Discussed here

Refactor VGA driver code to iteratively assign internal registers from main RAM

Currently, every register in the driver cog that needs a value calculated in the hub is populated one at a time, in what's essentially an unrolled loop. Need to determine if there is a way to set them all in a single tight loop.

Add hook to repo to fix ModelSim project file absolute paths

https://stackoverflow.com/questions/5623208/how-to-execute-a-command-right-after-a-fetch-or-pull-command-in-git

Quartus .gitignore removes top-level .tdf AHDL file

Allegedly it's considered a "generated file".

Potentially refactor Input driver to use counter module for 74HC165 polling

apin for clock, bpin for parallel load?

Develop dual-Propeller "display" routine

https://github.com/cspang1/JCAP/projects/1#card-6846453

Unroll loops in vga_render system

Execution time of critical code sections (e.g. tile palette and sprite parsing) can be reduced by unrolling loops.

Fix README HTML markup

Add project architecture discussion and block diagram to wiki

Create TX routine init watchdog

Currently there's no system in place to ensure the RX routine is listening before the TX routine begins a connection attempt. Need to add a watchdog system in the start method.

RX cog can miss TX ACK given worst-case hub access times

Fix is to add at least 3 nop instructions here.

Refactor graphics and VGA interaction to clean up parameters

Right now the VGA driver instantiation from the graphics system is trash: it passes seemingly random parameters such as VSync attributes and tile-line-to-scanline ratios. The solution is to pass a pointer to the base of them and perform a longmove. The only parameters passed will then be a base pointer to the graphics resources and a base pointer to the cog initialization attributes. Consideration needs to be given to exactly what should be calculated in the Graphics vs. VGA object. A lot of the calculations will overlap between VGA and RGBS, so it would be poor design to calculate the same attributes separately in the individual video drivers.

Some attributes are used directly by the video driver cog(s) via rdlongs, while others are set just before cog instantiation because they differ per cog (e.g. different vertical porch sizes for interlacing). Theoretically you could set ALL attributes in Spin before cog initialization, which would free up a bunch of space currently used for res and long variables, as well as remove the need to copy memory from the hub to the cog. This, however, greatly obfuscates the code. Right now this is being done in a couple of places where it doesn't need to be.

Ultimately, too much high-level code styling is being considered, given that the real core of the code is low-level. Cog code size is an additional driving factor.

Graphics driver can not be started again after being stopped

Calling graphics.start after graphics.stop (or calling graphics.start while driver is already running) results in no video.

Add content to Propeller 1 wiki page

Replace placeholder text with information on Propeller 1 microcontroller, how it works, etc.

Moving sprite tearing when max sprites per line reached

When the maximum number of sprites per line is reached and a sprite is being "erased" due to a higher-priority sprite, horizontal tearing is visible.

Develop dual-Propeller "render" routine

https://github.com/cspang1/JCAP/projects/1#card-6846453

Increase delay between multicog VGA driver cog initializations

Currently cogs are launched too rapidly to allow time for the porch sizes to be changed. Fix identified by @konimaru.

Add Progress Log to repo

On Surface desktop. Integrate with Wiki?

Implement Sprite rendering into VGA driver

Sprite rendering feature necessary for sprite-based graphics.

Can't set tile dimensions which don't wholly fit into pixel dimensions

Explained here: http://forums.parallax.com/discussion/167603/displaying-nxm-pixel-tiles-when-not-divisible-by-tile-scanline-height#latest

Need to remove 'D' from "74HC165D" in repo

The 'D' simply refers to the form factor of the 74HC165 IC, and isn't necessary.

Locks improperly implemented in vga_render due to incorrect addressing

vga_render.spin

  long  var_addr_base_          ' Variable for pointer to base address of Main RAM variables
  byte  cog_sem_                ' Cog semaphore
  long  start_line_             ' Variable for start of cog line rendering

This being in a VAR section will actually place cog_sem_ after start_line_ (first all longs, then all words followed by all bytes). In a DAT section the order remains but alignment is forced (3 unused bytes between cog_sem_ and start_line_).
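To make the ordering difference concrete, here is a small Python model of the two layout rules described above. It is a sketch, not the compiler's actual algorithm: the function names are hypothetical, and offsets are relative to the start of the block.

```python
SIZES = {"long": 4, "word": 2, "byte": 1}


def var_layout(decls):
    """VAR rule: all longs first, then words, then bytes (source order kept within a size)."""
    offsets, offset = {}, 0
    for kind in ("long", "word", "byte"):
        for decl_kind, name in decls:
            if decl_kind == kind:
                offsets[name] = offset
                offset += SIZES[kind]
    return offsets


def dat_layout(decls):
    """DAT rule: source order kept, each item padded up to its own alignment."""
    offsets, offset = {}, 0
    for kind, name in decls:
        size = SIZES[kind]
        offset = (offset + size - 1) // size * size  # round up to alignment
        offsets[name] = offset
        offset += size
    return offsets


decls = [("long", "var_addr_base_"), ("byte", "cog_sem_"), ("long", "start_line_")]
print(var_layout(decls))  # cog_sem_ lands after start_line_
print(dat_layout(decls))  # cog_sem_ stays second, followed by 3 padding bytes
```

Under the VAR rule a fixed offset of 4 from the base points at `start_line_`, not at `cog_sem_`, which is why the assumed address is wrong.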

Anyway, I'd simply pick a long and that's this problem out of the way. Next issue is that you actually have to read the value; right now you use its assumed address, e.g.

  long  var_addr_base_          ' Variable for pointer to base address of Main RAM variables
  long  cog_sem_                ' Cog semaphore
  long  start_line_             ' Variable for start of cog line rendering


        rdlong          clptr,  par             ' Initialize pointer to current scanline
        add             semptr, par             ' Initialize pointer to semaphore
        add             ilptr,  par             ' Initialize pointer to initial scanline
        rdbyte          sem, semptr             ' Get semaphore ID

:lock   lockset         sem wc                  ' Attempt to lock semaphore


semptr  long    4       ' Pointer to location of semaphore in Main RAM
ilptr   long    8       ' Pointer to location of initial scanline in Main RAM

sem     res     1       ' semaphore (or reuse semptr)

ATM it works because all cogs have the same value in semptr (par+1) of which the lowest 3 bits are used as a lock ID (%-01, locks don't have to be checked out to be used).

Add "tilt" to input system

Could add tilt signal to DS of 2nd 74HC165?

Modify render system to support 16-color tiles

Essentially will require having each line of a palette tile represented by a long (8 pixels, 32/8=4 bits of color each, 2^4=16 colors). Then, making simple modifications to color palette indexing and palette tile parsing in vga_render routine. 16-color sprites will be implemented in the same fashion.
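A possible packing of one 16-color tile line into a long is sketched below in Python. The pixel order (pixel 0 in the least-significant nibble) is an assumption made for illustration; the issue does not specify it, and the function names are hypothetical.

```python
PIXELS_PER_LINE = 8
BITS_PER_PIXEL = 32 // PIXELS_PER_LINE   # 4 bits, so 2**4 = 16 palette entries


def pack_tile_line(pixels):
    """Pack 8 palette indices (0..15) into one 32-bit value, pixel 0 in the low nibble."""
    if len(pixels) != PIXELS_PER_LINE:
        raise ValueError("a tile line holds exactly 8 pixels")
    line = 0
    for i, index in enumerate(pixels):
        if not 0 <= index < 2 ** BITS_PER_PIXEL:
            raise ValueError("palette index must fit in 4 bits")
        line |= index << (BITS_PER_PIXEL * i)
    return line


def unpack_tile_line(line):
    """Recover the 8 palette indices from a packed 32-bit tile line."""
    return [(line >> (BITS_PER_PIXEL * i)) & 0xF for i in range(PIXELS_PER_LINE)]


packed = pack_tile_line([1, 2, 3, 4, 5, 6, 7, 8])
print(hex(packed), unpack_tile_line(packed))
```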

Develop dual-Propeller "transfer" routine

For transferring sprite -> tile -> colors -> tile map from CPU to GPU for rendering and display.

Refactor VGA driver to perform interlacing

Current line of effort on Issue #20 involves interlacing scanlines via n separate cogs, where each cog renders and displays alternating scanlines e.g. 4 cogs render every fourth line.
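The line assignment can be pictured with a few lines of Python. This is a rough model only; `lines_for_cog` and the 12-line frame are hypothetical and stand in for whatever the driver actually uses.

```python
def lines_for_cog(cog_index, num_cogs, total_lines):
    """Scanlines handled by one cog when num_cogs cogs interlace a frame."""
    return list(range(cog_index, total_lines, num_cogs))


# With 4 cogs, each cog takes every fourth line of a (hypothetical) 12-line frame.
for cog in range(4):
    print(cog, lines_for_cog(cog, 4, 12))
```

Each scanline belongs to exactly one cog, so the cogs never contend for the same line.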

74HC165 inputs out of order on WaveForms

Likely due to pins connected in wrong configuration from logic analyzer.

Parameterize size of VGA driver scanline buffer(s)

Issue discussed here.

Remove delays in input.spin

Delays are redundant due to the fact that each instruction takes at least 4 clock cycles, and with 2 instructions to toggle, that's 1 second/(80,000,000/4/2) = 100 nanoseconds minimum between pin toggles, well above the timing requirements for the 74HC165.
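The same estimate as a quick calculation, using only the figures from the text above (80 MHz clock, at least 4 cycles per instruction, 2 instructions per toggle). The names are illustrative.

```python
CLOCK_HZ = 80_000_000          # system clock from the text above
CYCLES_PER_INSTRUCTION = 4     # minimum cycles per instruction
INSTRUCTIONS_PER_TOGGLE = 2    # instructions needed between pin toggles


def min_toggle_interval_ns(clock_hz=CLOCK_HZ):
    """Minimum time between pin toggles, in nanoseconds."""
    cycles = CYCLES_PER_INSTRUCTION * INSTRUCTIONS_PER_TOGGLE
    return cycles * 1_000_000_000 // clock_hz


print(min_toggle_interval_ns())
```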

[request] compatibility test

Seeing that you seem to have P1V hardware available (FPGA), could you please test the following fragment and report the result?

                org     0

test            jmpret  $, #setup

' light up an LED here or have some indication that the board is running.

                waitpeq $, #0

                hubop   vscl, #%10000_000
setup           mov     vscl, $-1
                jmp     #$1FF

Expected result (based on P1 evidence) is that the running indicator should be active.

The P1 has the odd quirk that instructions at address $1FF are cancelled. So in the program above, the REBOOT placed there should be ignored and execution should continue at address 0 falling through to 1 where e.g. an LED is lit followed by an endless wait(peq).

OTOH, if the REBOOT does fire, the running indicator won't be reached.

Perform incrementally misdirected memory access to narrow down tile/color palette VGA issue

VGA video dropping entirely with current wip/video/vga vga.spin code as of commit bca37a7.

Start by trying to directly access color palette and go from there.

Overhaul README including graphics

Create logo and use Markdownify README as template.

Need Quartus .gitignore file

Seems to be a bit complex due to Quartus's complicated structure.

Horizontal sprite tearing occurring on first row of tiles

Carry-over from issue #45.

~~Tearing occurs only around scanline 5/6, and behaves strangely:~~
~~Scanline 5 will have its left two pixels cut off~~
~~Scanline 6 will be shifted left one pixel~~

~~Can actually cause video dropout and Propeller restart by rapidly moving the sprite up and down past this scanline!~~

More details forthcoming.

Scanline buffer being populated out of bounds

Currently, scanline buffer locations over slbuff+79 are being written to errantly when sprites move past the right edge of the visible screen. This is causing tile pointers to be overwritten.

A check for slboff > 79 needs to be performed before parsing with a conditional jump to the next pixel.
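The proposed guard amounts to the following, shown here as a Python model rather than PASM. The function name is hypothetical, and the lower-bound check is an extra added for the sketch (Python would otherwise wrap negative indices); the issue itself only calls for the `slboff > 79` test.

```python
SLBUFF_LAST = 79   # highest valid offset from slbuff, per the issue text


def write_pixel(buffer, slboff, value):
    """Write only when slboff is inside the buffer; otherwise skip to the next pixel."""
    if slboff < 0 or slboff > SLBUFF_LAST:   # lower bound is an assumption of this sketch
        return False                         # models the conditional jump past the write
    buffer[slboff] = value
    return True


slbuff = [0] * (SLBUFF_LAST + 1)
print(write_pixel(slbuff, 79, 5), write_pixel(slbuff, 80, 7))
```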

Need to further optimize render routine to hit 64-sprite SAT

Target size for the sprite attribute table (SAT) is 64 sprites, with a maximum of 8 sprites per scanline.

Right now, in the worst-case scenario of sprites with zero transparent pixels, I'm missing the scanline deadline with all 64 sprites on screen and 8 sprites on a line. However, incremental testing has shown that I'm only missing by a sprite or two per line and ~10 sprites on the screen.

Execution time needs to be shaved off either in the tile rendering section (already unrolled loops) or the sprite visibility check area.

@konimaru I may need your magic on this one. Since I've unrolled the tile parsing loops I don't believe the Counter-B trick we discussed before is applicable.

EDIT: Bumping up the number of rendering cogs from 5 to 6 does the trick and then some (and will give me some time to implement background and (potentially) parallax scrolling). This means I will have to overwrite the interpreter code in cog 0 of the PPU from PASM. In the meantime, I still want to shave off as much time as possible @ 5 cogs.

Implement sprites into render system

Currently only static tiles are rendered.

Investigate multi-cog vs dedicated PPU video generation approach

Current driver has to render faster than it can load the next tile/color at high tile dimensions, and adding the sprite rendering feature will only exacerbate this. The benefits of spreading the rendering of each line over multiple cogs vs having a dedicated PPU with screen buffer need to be investigated.

Implement a daisy-chain of 74HC165 modules on the DE0-Nano

Add another 74HC165 module to the top-level AHDL top.tdf and connect it to the original module as well as the necessary I/O pins.

VGA driver tile and color retrieval refactor

While probably not terribly important right now I'd suggest a minor change to this:

        shl             ti,     tOffset         ' Multiply tile index by size of tile map
        add             ti,     tpbase          ' Increment tile index to correct line
        add             ti,     tpptr           ' Add tile palette pointer to tile index to specify row of tile to be displayed
rdtile  rdlong          tile,   ti              ' Read 16-pixel-wide tile from Main RAM
        shl             ci,     #2              ' Multiply color index by size of color palette
        add             ci,     cpbase          ' Increment color index to correct palette
        rdlong          colors, ci              ' Read color palette from Main RAM
movp    mov             0-0,    tile            ' Store tile row to pixel buffer        
movc    mov             0-0,    colors          ' Store color palette to color buffer
        add             movp,   incDest         ' Increment tile buffer pointer
        add             movc,   incDest         ' Increment color buffer pointer

Here tile and colors are loaded from hub then transferred to a cog location (2 steps). This could easily be done in a single step e.g.

        shl             ti,     tOffset         ' Multiply tile index by size of tile map
        add             ti,     tpbase          ' Increment tile index to correct line
        add             ti,     tpptr           ' Add tile palette pointer to tile index to specify row of tile to be displayed
rdtile
movp    rdlong          0-0,    ti              ' Read 16-pixel-wide tile from Main RAM
        shl             ci,     #2              ' Multiply color index by size of color palette
        add             ci,     cpbase          ' Increment color index to correct palette
movc    rdlong          0-0,    ci              ' Read color palette from Main RAM
        add             movp,   incDest         ' Increment tile buffer pointer
        add             movc,   incDest         ' Increment color buffer pointer
