Making a Minecraft clone that runs on the Teensy 4.1 in the Arduino environment.
Files:
ideal_worldgen.py - Prototype procedural world generator in Python that uses tectonic plate theory. Yes, it is actually procedural. See project logs for details/theory. (x-python, 8.88 kB, 05/22/2024 at 15:06)
teensy3d_original_tests.zip (Zip Archive, 53.21 kB, 05/19/2024 at 20:32)
I managed to get it to work with a monitor using my #RA8875 VGA card, and it is working rather well:
I'm getting around 20-30 FPS, depending on where I point the camera. With the RA8875, I can use 8-bit color, so the colors look a lot better than anything previously shown on this project. The cost is that there is no longer any transparency, but this isn't noticeable yet. It took me a while to get the new color system working, so the above picture was taken after I fixed the color issues, but the video below was recorded before the fix. The problem was related to how the color data is saved on the SD card to make it easier to load.
The second half of the video largely just describes how the rendering engine works - if you follow this project, then you probably already know it. The main thing I've done is port my #NTIOS (Arduino OS) RA8875 driver to the Arduino Minecraft codebase - I'm not using NTIOS for this project because I need to keep resource consumption to an absolute minimum for this limited hardware.
Processed 1035 blocks of 16384 total blocks.
Total time: 43.820000ms
Render time: 43.351000ms
Time taken for 100 frames: 4412ms
Time per frame: 44120.000us
FPS: 22.7
The difference between the total time and the render time is less than half a millisecond - or just over 1% of the render time. About 98.5% of the render time is spent on the 3D rendering code, and of that probably about 99% is spent on computing and drawing individual pixels - not even matrix multiplication or any actual 3D projection. Ok, so let's see how much time we add by just adding some if blocks to check the texture coordinates:
// Almost all textures tile well.
if (kx > 1) kx -= 1;
else if (kx < 0) kx += 1;
if (ky > 1) ky -= 1;
else if (ky < 0) ky += 1;
Processed 1035 blocks of 16384 total blocks.
Total time: 46.214000ms
Render time: 45.750000ms
Time taken for 100 frames: 4652ms
Time per frame: 46520.000us
FPS: 21.5
That's a big increase, considering it's a simple if block. The reason is that the fragment code does an unbelievable number of iterations, so that if block is probably running millions of times per frame. Altogether the if block adds about 2.39ms, or 5%. That's a lot when you need less than 16ms for 60FPS. If a simple if block like that adds that much time, then what about the other if blocks in the fragment code?
inline void fragmentShaderRaw(DisplayBuffer* display, int x, int y, uint8_t z, Color textureColor) {
    uint32_t displayIndex = display->width * y + x;
    uint8_t* depthLoc = &(display->depthArray[displayIndex]);
    if (*depthLoc < z)
        // Discard the fragment
        return;
    if (textureColor >> 4 == 0)
        // The fragment has no color; discard.
        return;
    Color fragColor = textureColor;
    // Apply this fragment to the framebuffer
    Color* outColor = &display->colorArray[displayIndex];
    if (fragColor >> 4 == 15)
        *outColor = fragColor;
    else {
        *outColor |= fragColor;
        *outColor &= 0x7;
    }
    *depthLoc = z;
}
Here's where I'm at now with the rendering:
I can render some of the terrain's shape, but the texture coordinates are not handled properly, and I can currently only render one chunk. This is the first part of getting the rendering to do what I want; after this I need to:
Block Iteration Speed - How to Render Voxels Fast
I don't have screenshots, but originally my render for a whole chunk took over a second - so entirely unplayable. This was for two reasons:
It was easy to make the chunk sides not render. We don't want them drawn because the player should never see the sides anyway - they would simply enter a new chunk. I think this was the bulk of it, because when I switched back to the loop, it only took 82ms to render. That's a 90%+ speed boost.
The harder part was eliminating the loop. How do you render the blocks without checking each one? Well, all the blocks that need to be rendered have two things in common. First, they all occur within the camera's view. Second, they all touch the same connected region of air. By using something akin to a floodfill algorithm, we can find only blocks visible to the player, and ignore about 90% of the blocks in the chunk - never even iterating over them.
A floodfill algorithm has expensive overhead, especially when considering the need to check the camera's viewport, which involves two matrix multiplications for every block. However, to iterate over every block in the chunk, it is 16x16x64 iterations, or 16384 iterations. For many blocks that aren't visible, this may cause rendering, and even for non-rendered blocks, it involves a series of checks for rendering. This is especially true of caves. So, it is much better never to iterate over those blocks at all. The overhead for the floodfill algorithm, as it turns out, is well worth it.
My floodfill algorithm processed only 1140 blocks, or 7% of the 16384 blocks in the chunk. It isn't perfect and probably has a bug or two, but still this is very pleasing.
The algorithm is a modified Breadth-First-Search. Here's the code:
// This essentially checks if the block is in the camera's view
bool isValidNode(int x, int y, int z) {
    Vector3 pos = { (float)x, (float)y, (float)z };
    applyMat4fToVertex(&pos, &(settings.worldMatrix));
    pos.z += 0.5f;
    if (pos.z > 0 && pos.z < 0.1f)
        pos.z = 0.1f;
    float k = 1 / pos.z;
    pos.x = (pos.x + (pos.x < 0 ? 0.5f : -0.5f)) * k;
    pos.y = (pos.y + (pos.y < 0 ? 0.5f : -0.5f)) * k;
    applyMat4fToVertex(&pos, &settings.projectionMatrix);
    return pos.z >= 0 && pos.z < 260 && pos.x >= 0 && pos.y >= 0 && pos.x < width && pos.y < height;
}
void searchRender(uint8_t blocks[16][16][64], uint8_t camx, uint8_t camy, uint8_t camz) {
    // This function does a breadth-first search, where the graph's nodes are every block within the camera's viewport.
    // camx, camy, and camz are the coordinates to start the search at.
    Vector3iQueue queue(128);
    // One bit per block: searchedBlocks[y][z] holds a 16-bit mask indexed by x.
    uint16_t searchedBlocks[16][64];
    bzero(searchedBlocks, 16*64*2);
    queue.append(camx, camy, camz);
    int blocksProcessed = 0;
    uint8_t x, y, z;
    while (queue.get(x, y, z)) {
        // Fill it.
        // q takes the values 1, 2, 4, 8, 16, 32: one bit per neighbor (+x, +y, +z, -x, -y, -z).
        for (uint8_t q = 1; q < 64; q*=2) {
            uint8_t i = x + ((q >> 0) & 1) - ((q >> 3) & 1);
            uint8_t j = y + ((q >> 1) & 1) - ((q >> 4) & 1);
            uint8_t k = z + ((q >> 2) & 1) - ((q >> 5) & 1);
            // Skip blocks outside this chunk.
            // z only runs 0..63, so mask with 0xC0; 0x80 would let k == 64 through.
            if (i & 0xF0 || j & 0xF0 || k & 0xC0)
                continue;
            // Mark the block as checked
            if ((searchedBlocks[j][k] >> i) & 1)
                continue;
            searchedBlocks[j][k] |= 1 << i;
            blocksProcessed++;
            // Check if the block is in view:
            if (!isValidNode(i, j, k))
                continue;
            // Should we render it, or fill it?
            uint8_t block = blocks[i][j][k];
...
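Since the listing above is cut off, here is a minimal, self-contained sketch of the same idea, not the project's actual code. It walks the air connected to a start position with a breadth-first search and collects every solid block that touches that air. The names `Chunk`, `BlockPos` and `findVisibleBlocks` are made up for this sketch, block id 0 is assumed to mean air, the start position is assumed to be air, and the camera view check (`isValidNode`) is left out to keep it short.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Chunk dimensions used by the project: 16 x 16 x 64 blocks.
constexpr int SX = 16, SY = 16, SZ = 64;

// Hypothetical types for this sketch.
struct BlockPos { uint8_t x, y, z; };
using Chunk = uint8_t[SX][SY][SZ];  // assumption: id 0 is air, anything else is solid

// Breadth-first search through the air connected to `start`.
// Returns every solid block that touches that air region.
std::vector<BlockPos> findVisibleBlocks(const Chunk& blocks, BlockPos start) {
    static const int dirs[6][3] = {
        {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}};
    auto index = [](int x, int y, int z) { return (x * SY + y) * SZ + z; };

    std::vector<bool> seen(SX * SY * SZ, false);
    std::vector<BlockPos> visible;
    std::queue<BlockPos> pending;

    seen[index(start.x, start.y, start.z)] = true;
    pending.push(start);

    while (!pending.empty()) {
        BlockPos p = pending.front();
        pending.pop();
        for (const auto& d : dirs) {
            int x = p.x + d[0], y = p.y + d[1], z = p.z + d[2];
            // Skip neighbors outside this chunk.
            if (x < 0 || x >= SX || y < 0 || y >= SY || z < 0 || z >= SZ)
                continue;
            // Visit each block at most once.
            if (seen[index(x, y, z)])
                continue;
            seen[index(x, y, z)] = true;

            BlockPos n{static_cast<uint8_t>(x), static_cast<uint8_t>(y),
                       static_cast<uint8_t>(z)};
            if (blocks[x][y][z] == 0)
                pending.push(n);       // air: keep flooding through it
            else
                visible.push_back(n);  // solid: a render candidate, do not pass through
        }
    }
    return visible;
}
```

Blocks buried behind other solid blocks are never reached, which is where the savings come from. On the Teensy, the heap-backed `std::queue` and `std::vector` would probably be swapped for fixed-size buffers like the `Vector3iQueue` and bitmask used above.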
I have the rendering pipeline mostly working, but I need something to actually render. At some point world gen will be required, so I thought I should just implement it now.
Most world gen uses Perlin noise to create the heightmap, which is a critical feature of the terrain - perhaps the most important feature. The problem is that it isn't very realistic on a larger scale. Here is a map of a rather large Minecraft world, generated with Perlin noise like this, and it does not have realistic continents, islands, biomes, etc:
In real life, there are island chains, separated continents, deep trenches under the oceans, mountain chains, etc. All of these features are created by plate tectonics. But how can we simulate plate tectonics in a procedural generation algorithm?
Plate Tectonics in an Infinite Procedurally Generated World
When I started working on this world generator, I thought that maybe I could generate a list of points near a chunk, and make fault lines as lines between them, maybe modified with some noise. The issue is that with an infinite, procedurally generated world, keeping track of and searching these points is not really feasible, and as far as I know, there is no algorithm to compute the nearby points from arbitrary given coordinates. This is important, because when we generate the world, we generate it one chunk at a time, where a chunk is a 16x16x64 tower of blocks that forms one section of the world. We cannot generate the entire world at once, because the world may be too large, even infinite. In essence, it isn't feasible to generate the tectonic plates themselves. Thankfully, there is a great workaround.
A procedural world generator like this is, in essence, a function that describes the characteristics of the world at point x, y given a seed value. So, we need a function that describes the tectonic plate and/or nearby plates at the coordinate x, y. And we must do this without having a list of tectonic plates to pull from, as the number of tectonic plates is infinite.
Bent Space
Instead of generating the tectonic plates, it works far better to start with a square grid of tectonic plates, then project the x and y coordinates onto this grid using a bunch of cool math. Here is the same world as the one I showed in the first image, but without coordinate transformation. Note that fault lines are still simulated:
As you can see, it is very blocky. All I did to get from this to the world you saw in the first image was apply changes to the x and y coordinates. This makes the blocky grid disappear, and allows us to bypass that tectonic plate issue.
Steps of Coordinate Alteration
The sequence of steps I apply is as follows:
For each of these features, a world config controls their strength, and the strength is then modulated with more Perlin noise. Each layer of Perlin noise should have a different seed, although looking at my code, it seems I didn't do this. Anyway, different regions of the world should have different coordinate-altering characteristics, leading to differences in the way the terrain looks and feels in different regions.
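The general idea of bending the coordinates before looking up a plate on a square grid can be sketched as below. This is a generic domain-warp illustration, not the sequence of steps used in ideal_worldgen.py (those steps are not reproduced here). Everything in it is an assumption made for the sketch: the `hash2` mixer, the `valueNoise` function standing in for Perlin noise, the `PlateId` struct, and the `plateSize`, `warpStrength` and `warpScale` parameters.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical integer hash: mixes lattice coordinates and a seed into 32 bits.
static uint32_t hash2(int32_t x, int32_t y, uint32_t seed) {
    uint32_t h = seed;
    h ^= static_cast<uint32_t>(x) * 0x85EBCA6Bu;
    h = (h << 13) | (h >> 19);
    h ^= static_cast<uint32_t>(y) * 0xC2B2AE35u;
    h *= 0x27D4EB2Fu;
    h ^= h >> 15;
    return h;
}

// Smooth value noise in [-1, 1]; a simple stand-in for Perlin noise.
static float valueNoise(float x, float y, uint32_t seed) {
    float fx = std::floor(x), fy = std::floor(y);
    int32_t ix = static_cast<int32_t>(fx), iy = static_cast<int32_t>(fy);
    float tx = x - fx, ty = y - fy;
    // Smoothstep the interpolation weights so the noise has no creases.
    tx = tx * tx * (3.0f - 2.0f * tx);
    ty = ty * ty * (3.0f - 2.0f * ty);
    auto corner = [&](int dx, int dy) {
        return (hash2(ix + dx, iy + dy, seed) & 0xFFFFu) / 32767.5f - 1.0f;
    };
    float a = corner(0, 0) + (corner(1, 0) - corner(0, 0)) * tx;
    float b = corner(0, 1) + (corner(1, 1) - corner(0, 1)) * tx;
    return a + (b - a) * ty;
}

// Which cell of the square plate grid a (bent) world coordinate lands in.
struct PlateId { int32_t px, py; };

PlateId plateAt(float x, float y, uint32_t seed,
                float plateSize, float warpStrength, float warpScale) {
    // Offset each axis with its own noise layer (different seed per layer),
    // then look the bent coordinate up on the regular grid.
    float wx = x + warpStrength * valueNoise(x * warpScale, y * warpScale, seed);
    float wy = y + warpStrength * valueNoise(x * warpScale, y * warpScale, seed + 1);
    return { static_cast<int32_t>(std::floor(wx / plateSize)),
             static_cast<int32_t>(std::floor(wy / plateSize)) };
}
```

With `warpStrength` set to 0 this reduces to the blocky square grid; raising it moves the plate boundaries around without ever needing a list of plates, because the plate is computed directly from x, y and the seed.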
This coordinate...
Background
I found out about the Teensy 4.0 shortly after it came out, and it quickly became my favorite MCU due to the wide range of capabilities, incredible horsepower, and low price. I think that MCU marked a huge change for the Arduino platform, because it was such a leap forward in processing power - it opens up some interesting avenues in the Arduino environment that just weren't open before. Since then I've hacked together a few interesting devices with the Teensy 4.1, but I haven't done much with them. They haven't really done anything cool. Part of the issue is that I didn't build any great devices to run anything on. Everything sorta had problems or was hacked together. So, in the latest iteration of the #Arduino Desktop project, I designed a new platform that I can actually build some cool stuff on. We also have a new #uPD7220 Retro Graphics Card (and VGA hack) which I designed, which goes with the new Teensy system. With these in mind, I can contemplate a lot more interesting software to load onto the Teensy: namely, a 3D game.
Is the Teensy fast enough?
This is the first big question. With speeds approaching old phone processors, which ran Minecraft Pocket Edition, it feels like there's a chance; but I imagine those phones had hardware to accelerate graphics. We don't have that. So, I've translated some of my Java LWJGL code into something Arduino-compatible, and run some speed tests:
Considering the above numbers, we can draw about 65 30x30 pixel quads every frame, if we want to hit 60 Hz. That's not much, but it's certainly something to work with. Each block is drawn with 3 quads, but at least one quad is almost always very small, so we can call it 2 quads - this gives us about 32 blocks we can render each frame at 60 Hz. I'm sure there are some optimizations or hacks I can do to improve this - it's a work in progress. Nonetheless, this is certainly enough to run a (small) 3D game, no?
Other math: I want a display with about 192000 pixels. If about 2/3 of the screen is to be rendered on average, then this gives us 128k pixels to render every frame. Each pixel is rendered as one or more fragments, so we have a minimum of 128k fragments to render. Each fragment takes about 0.3us to render when including quad processing, so the minimum render time should be approximately 38.4ms, or 26 FPS. That's actually not that bad, all things considered; I expected it to be much lower. I mean, I wouldn't notice that FPS; I used to play Minecraft at 14 FPS when I was a kid :)
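The estimate above can be written as a small helper so the inputs are easy to change. This is only a sketch of the arithmetic in the paragraph; the names `RenderEstimate` and `estimateRender` are made up for illustration, and the 0.3us per fragment figure is the measured value quoted above.

```cpp
// Hypothetical helper that reproduces the back-of-the-envelope estimate.
struct RenderEstimate {
    double fragments;  // minimum fragments drawn per frame
    double frameMs;    // minimum render time per frame, in milliseconds
    double fps;        // resulting upper bound on frames per second
};

inline RenderEstimate estimateRender(double displayPixels,
                                     double coveredFraction,
                                     double usPerFragment) {
    RenderEstimate e;
    // At least one fragment per covered pixel.
    e.fragments = displayPixels * coveredFraction;
    // Microseconds to milliseconds.
    e.frameMs = e.fragments * usPerFragment / 1000.0;
    e.fps = 1000.0 / e.frameMs;
    return e;
}
```

Calling `estimateRender(192000, 2.0 / 3.0, 0.3)` gives 128000 fragments, about 38.4ms per frame, and about 26 FPS, matching the numbers above. Overdraw (more than one fragment per pixel) only makes the real figure worse, so this is a best case.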
Display Limitations, Color, and Loading Minecraft's old terrain.png File
To make things extra fun, I'm going to run this with a uPD7220 GDC to handle the video output. Due to limitations of graphics memory size and bandwidth, we only get 16 colors. We can't use 24-bit color anyway, as the Teensy 4.1 would run out of memory. Also, for textures and the framebuffer, the Teensy 4.1 has to track transparency along with the color. Thus: