blps's People

Contributors

rkr35

blps's Issues

Split sdk.rs

We're generating a 23 MB sdk.rs right now. Most text editors disable features like syntax highlighting to cope with a file that large.
Searching through the file is laggy, and navigation stutters.

Back in #1, we introduced the idea of a single, flat sdk.rs. The motivation was that the structure hierarchy was getting too complex. The original implementation produced one file per structure and enum. It was tricky to manage the paths to nested modules and submodules, and we hadn't worked out how to reference structures from other modules/submodules.

Now that we've worked through generating sdk.rs, it's clearer how to solve those problems.

Here's a layout that can work.

In sdk/mod.rs:

// Common attributes
#![allow(bindings_with_variant_name)]
#![allow(clippy::doc_markdown)]
#![allow(dead_code)]
#![allow(non_camel_case_types)]
#![allow(non_snake_case)]

// Common imports
use crate::game::{self, Array, FString, NameIndex, ScriptDelegate, ScriptInterface};
use crate::hook::bitfield::{is_bit_set, set_bit};
use crate::GLOBAL_OBJECTS;
use std::mem::MaybeUninit;
use std::ops::{Deref, DerefMut};

// One module and re-export per structure and enum.
mod text_buffer;
pub use text_buffer::*;

mod vector;
pub use vector::*;

mod e_axis;
pub use e_axis::*;

// and so on.

Per structure/enum:

sdk/text_buffer.rs:

// Get access to common imports + other structures/enums
use super::*;

/// Class Core.TextBuffer, 0x24 (0x60 - 0x3c)
#[repr(C)]
pub struct TextBuffer {
    // 0x0(0x3c)
    base: Object,

    // 0x3c(0x24)
    pad_at_0x3c: [u8; 0x24],
}

impl Deref for TextBuffer {
    type Target = Object;

    fn deref(&self) -> &Self::Target {
        &self.base
    }
}

impl DerefMut for TextBuffer {
    fn deref_mut(&mut self) -> &mut Self::Target {
        &mut self.base
    }
}

sdk/vector.rs:

use super::*;

/// ScriptStruct Core.Object.Vector, 0xc
#[repr(C)]
pub struct Vector {
    // 0x0(0x4)
    pub X: f32,

    // 0x4(0x4)
    pub Y: f32,

    // 0x8(0x4)
    pub Z: f32,
}

sdk/e_axis.rs:

#[repr(u8)]
pub enum EAxis {
    None,
    X,
    Y,
    Blank,
    Z,
    Max,
}

There are several benefits to splitting sdk.rs:

  1. Text editors won't choke on large buffers, so syntax highlighting is restored.
  2. Fuzzy file searching can help find structures.
  3. A change to one structure shows up as a diff in one file, not across the entire sdk.rs.
  4. Custom user implementations can be placed in the corresponding structure/enum file.
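
A minimal sketch of how the generator could derive the module file name and the mod.rs entry for each structure or enum. The names to_module_name and write_module_decl are hypothetical, and the snake_case rule is an assumption chosen to reproduce the text_buffer, vector, and e_axis examples above.

```rust
use std::fmt::{self, Write};

/// Converts a structure or enum name to a snake_case module name,
/// e.g. "TextBuffer" -> "text_buffer" and "EAxis" -> "e_axis".
pub fn to_module_name(name: &str) -> String {
    let chars: Vec<char> = name.chars().collect();
    let mut out = String::with_capacity(name.len() + 4);
    for (i, &c) in chars.iter().enumerate() {
        if c.is_uppercase() && i > 0 {
            let prev = chars[i - 1];
            let next_is_lower = chars.get(i + 1).map_or(false, |n| n.is_lowercase());
            // Start a new word after a lowercase letter or digit, or when an
            // uppercase run ends ("EAxis" splits between 'E' and 'A').
            if prev.is_lowercase() || prev.is_ascii_digit() || (prev.is_uppercase() && next_is_lower) {
                out.push('_');
            }
        }
        out.extend(c.to_lowercase());
    }
    out
}

/// Writes the `mod` declaration and re-export that sdk/mod.rs needs for one item.
pub fn write_module_decl(out: &mut impl Write, name: &str) -> fmt::Result {
    let module = to_module_name(name);
    writeln!(out, "mod {};", module)?;
    writeln!(out, "pub use {}::*;", module)?;
    writeln!(out)
}
```

This sketch does not handle module names that collide with Rust keywords, or two items from different packages that map to the same file name. Both would need their own disambiguation.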

Disambiguate duplicate field names inside bitfields

Notice in the structure below that we generate multiple is_user_flag and set_user_flag methods for USER_FLAG. Duplicate method names in one impl block don't compile.
We need to number the USER_FLAGs like we do for enums and structure fields.

impl NxDestructibleDepthParameters {
    /// TAKE_IMPACT_DAMAGE
    pub fn is_take_impact_damage(&self) -> bool {
        is_bit_set(self.bitfield, 0)
    }

    /// TAKE_IMPACT_DAMAGE
    pub fn set_take_impact_damage(&mut self, value: bool) {
        set_bit(&mut self.bitfield, 0, value);
    }

    /// IGNORE_POSE_UPDATES
    pub fn is_ignore_pose_updates(&self) -> bool {
        is_bit_set(self.bitfield, 1)
    }

    /// IGNORE_POSE_UPDATES
    pub fn set_ignore_pose_updates(&mut self, value: bool) {
        set_bit(&mut self.bitfield, 1, value);
    }

    /// IGNORE_RAYCAST_CALLBACKS
    pub fn is_ignore_raycast_callbacks(&self) -> bool {
        is_bit_set(self.bitfield, 2)
    }

    /// IGNORE_RAYCAST_CALLBACKS
    pub fn set_ignore_raycast_callbacks(&mut self, value: bool) {
        set_bit(&mut self.bitfield, 2, value);
    }

    /// IGNORE_CONTACT_CALLBACKS
    pub fn is_ignore_contact_callbacks(&self) -> bool {
        is_bit_set(self.bitfield, 3)
    }

    /// IGNORE_CONTACT_CALLBACKS
    pub fn set_ignore_contact_callbacks(&mut self, value: bool) {
        set_bit(&mut self.bitfield, 3, value);
    }

    /// USER_FLAG
    pub fn is_user_flag(&self) -> bool {
        is_bit_set(self.bitfield, 4)
    }

    /// USER_FLAG
    pub fn set_user_flag(&mut self, value: bool) {
        set_bit(&mut self.bitfield, 4, value);
    }

    /// USER_FLAG
    pub fn is_user_flag(&self) -> bool {
        is_bit_set(self.bitfield, 5)
    }

    /// USER_FLAG
    pub fn set_user_flag(&mut self, value: bool) {
        set_bit(&mut self.bitfield, 5, value);
    }

    /// USER_FLAG
    pub fn is_user_flag(&self) -> bool {
        is_bit_set(self.bitfield, 6)
    }

    /// USER_FLAG
    pub fn set_user_flag(&mut self, value: bool) {
        set_bit(&mut self.bitfield, 6, value);
    }

    /// USER_FLAG
    pub fn is_user_flag(&self) -> bool {
        is_bit_set(self.bitfield, 7)
    }

    /// USER_FLAG
    pub fn set_user_flag(&mut self, value: bool) {
        set_bit(&mut self.bitfield, 7, value);
    }
}
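
A minimal sketch of the numbering, assuming a per-structure counter keyed by field name. The function name disambiguate and the _1, _2 suffix scheme are assumptions; the scheme we already use for enums and structure fields may differ and should take precedence.

```rust
use std::collections::HashMap;

/// Returns one unique name per input name. The first occurrence is kept
/// as-is; later occurrences get a numeric suffix (an assumed scheme).
pub fn disambiguate<'a>(names: &[&'a str]) -> Vec<String> {
    let mut seen: HashMap<&'a str, u8> = HashMap::new();
    names
        .iter()
        .map(|&name| {
            let count = seen.entry(name).or_insert(0);
            let unique = if *count == 0 {
                name.to_string()
            } else {
                format!("{}_{}", name, count)
            };
            *count += 1;
            unique
        })
        .collect()
}
```

With this, the four USER_FLAG bits would produce is_user_flag, is_user_flag_1, is_user_flag_2, and is_user_flag_3 once lowercased. The sketch does not guard against a structure that already has a field literally named USER_FLAG_1.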

Remove genial feature

#17 made genial the default code generator. We can remove the genial feature, the conditional compilation around it, and the codegen comments.

Emit offset of fields

The C++ SDK generator emits the offset of a structure's/class's field in a comment next to the field:

// Class Engine.HUD
// 0x009C (0x0224 - 0x0188)
class AHUD : public AActor
{
public:
	struct FColor                                      WhiteColor;                                               // 0x0188(0x0004) (Const)
	struct FColor                                      GreenColor;                                               // 0x018C(0x0004) (Const)
	struct FColor                                      RedColor;                                                 // 0x0190(0x0004) (Const)
	class APlayerController*                           PlayerOwner;                                              // 0x0194(0x0004)
	class AActor*                                      AnimDebugThis;                                            // 0x0198(0x0004)
	struct FName                                       AnimDebugStartingPoint;                                   // 0x019C(0x0008)
	unsigned long                                      bLostFocusPaused : 1;                                     // 0x01A4(0x0004) (Transient)
	unsigned long                                      bShowHUD : 1;                                             // 0x01A4(0x0004)
	unsigned long                                      bShowScores : 1;                                          // 0x01A4(0x0004)
	unsigned long                                      bShowDebugInfo : 1;                                       // 0x01A4(0x0004)
	unsigned long                                      bShowAnimDebug : 1;                                       // 0x01A4(0x0004)
	unsigned long                                      bShowBadConnectionAlert : 1;                              // 0x01A4(0x0004) (Edit)
	unsigned long                                      bMessageBeep : 1;                                         // 0x01A4(0x0004) (Config, GlobalConfig)
	unsigned long                                      bShowOverlays : 1;                                        // 0x01A4(0x0004)
	float                                              HudCanvasScale;                                           // 0x01A8(0x0004) (Config, GlobalConfig)

Emitting the offset next to the corresponding field helps verify that the generated structure matches the in-memory layout of the same structure, especially when doing the verification through ReClass.

Here's what that corresponding section of the HUD class currently looks like from the Rust generated SDK:

#[repr(C)]
pub struct HUD {
    base: Actor,
    pub WhiteColor: Color,
    pub GreenColor: Color,
    pub RedColor: Color,
    pub PlayerOwner: Option<&'static mut PlayerController>,
    pub AnimDebugThis: Option<&'static mut Actor>,
    pub AnimDebugStartingPoint: NameIndex,
    pub bitfield: u32,
    pub HudCanvasScale: f32,

(The methods to query and modify bitfield are in an impl block for the structure.)
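
A minimal sketch of emitting the offset and size as a comment above each field. The Field type and write_field are hypothetical; the comment format follows the `// 0x0(0x3c)` style used in the sdk.rs split proposal.

```rust
use std::fmt::{self, Write};

/// Hypothetical description of one field as read from the game's reflection data.
pub struct Field<'a> {
    pub name: &'a str,
    pub ty: &'a str,
    pub offset: u32,
    pub size: u32,
}

/// Writes an `// offset(size)` comment followed by the field declaration.
pub fn write_field(out: &mut impl Write, field: &Field<'_>) -> fmt::Result {
    writeln!(out, "    // {:#x}({:#x})", field.offset, field.size)?;
    writeln!(out, "    pub {}: {},", field.name, field.ty)
}
```

For WhiteColor in the HUD class, this would put `// 0x188(0x4)` above `pub WhiteColor: Color,`, which can be compared directly against the C++ SDK or ReClass.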

Emit class methods

Classes have methods on them. These methods can have parameters and a return value.

Here's an example of a generated method from the C++ generator:

void UCanvas::SetPos(float PosX, float PosY, float PosZ)
{
	static auto fn = UObject::FindObject<UFunction>("Function Engine.Canvas.SetPos");

	UCanvas_SetPos_Params params;
	params.PosX = PosX;
	params.PosY = PosY;
	params.PosZ = PosZ;

	auto flags = fn->FunctionFlags;
	fn->FunctionFlags |= 0x400;

	UObject::ProcessEvent(fn, &params);

	fn->FunctionFlags = flags;
}

We don't emit methods right now, but we should.

We need to be careful about an empty Parameters structure in our implementation.
An empty #[repr(C)] structure in Rust is 0 bytes. An empty structure in C++ is at least one byte.
Therefore, an empty #[repr(C)] Rust structure is not FFI-compatible with its C++ counterpart.
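
A minimal sketch of the size mismatch and one possible workaround. EmptyParameters and PaddedParameters are hypothetical names, and padding with a single byte is only one option; passing a null parameters pointer for methods without parameters may work just as well, but that has not been verified against the game.

```rust
use std::mem::size_of;

/// What a naive generator would emit for a method with no parameters.
#[repr(C)]
pub struct EmptyParameters {}

/// Assumed workaround: one padding byte, matching the minimum size of an
/// empty C++ structure.
#[repr(C)]
pub struct PaddedParameters {
    _pad: u8,
}

/// Returns (size of the empty structure, size of the padded structure).
pub fn parameter_sizes() -> (usize, usize) {
    (size_of::<EmptyParameters>(), size_of::<PaddedParameters>())
}
```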

Hook ProcessEvent

I believe the following is the beginning of ProcessEvent:

0115D9F0 | 55 | push ebp |
0115D9F1 | 8BEC | mov ebp,esp |
0115D9F3 | 6A FF | push FFFFFFFF |
0115D9F5 | 68 B8E4FD01 | push borderlandspresequel.1FDE4B8 |  
0115D9FA | 64:A1 00000000 | mov eax,dword ptr fs:[0] |  
0115DA00 | 50 | push eax |  
0115DA01 | 83EC 50 | sub esp,50 |  
0115DA04 | A1 E0834902 | mov eax,dword ptr ds:[24983E0] |  
0115DA09 | 33C5 | xor eax,ebp |  
0115DA0B | 8945 F0 | mov dword ptr ss:[ebp-10],eax |  
0115DA0E | 53 | push ebx |  
0115DA0F | 56 | push esi |  
0115DA10 | 57 | push edi |  
0115DA11 | 50 | push eax |  
0115DA12 | 8D45 F4 | lea eax,dword ptr ss:[ebp-C] |  
0115DA15 | 64:A3 00000000 | mov dword ptr fs:[0],eax |  
0115DA1B | 8BF1 | mov esi,ecx |  
0115DA1D | 8975 EC | mov dword ptr ss:[ebp-14],esi |  
0115DA20 | 8B7D 08 | mov edi,dword ptr ss:[ebp+8] |  
0115DA23 | F787 80000000 02040000 | test dword ptr ds:[edi+80],402 |  

And here's one of the call sites:

mov esi,dword ptr ds:[ebx] ; Get pointer to first vtable entry of UObject.
push 0
lea edx,dword ptr ss:[ebp+8]
push edx
push 0
push eax
push ecx
mov ecx,ebx
movss dword ptr ss:[ebp+8],xmm0
call borderlandspresequel.10A0920
mov edx,dword ptr ds:[esi+E8] ; Index into vtable to get address of ProcessEvent
push eax
mov ecx,ebx ; ecx = ebx = this pointer = UObject we're calling ProcessEvent on
call edx ; Call ProcessEvent

The [esi+E8] suggests that the vtable index for ProcessEvent is 0xE8 / 4 = 58.

I'm assuming our detoured function will need to use the fastcall calling convention and ignore edx as the second parameter, since Rust doesn't have a stable thiscall calling convention I could use in this scenario. Otherwise, I'm not sure how we would access the this pointer (the UObject that ProcessEvent is called on as a member function), which the game stores in ecx.
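
A rough sketch of the two ideas above: computing the vtable slot from the byte offset, and a fastcall detour that receives this in ecx and discards edx. Object, Function, and my_process_event are hypothetical stand-ins. The three stack arguments (function, parameters, return value) are an assumption that the disassembly excerpt does not confirm; because fastcall makes the callee clean the stack, the count has to match the original's ret exactly.

```rust
/// Converts a byte offset into a vtable to a slot index.
pub const fn vtable_index(byte_offset: usize, pointer_size: usize) -> usize {
    byte_offset / pointer_size
}

/// The game is a 32-bit process, so each vtable slot is 4 bytes.
pub const PROCESS_EVENT_INDEX: usize = vtable_index(0xE8, 4);

/// Opaque stand-ins for the generated SDK types (hypothetical).
#[repr(C)]
pub struct Object {
    _opaque: [u8; 0],
}

#[repr(C)]
pub struct Function {
    _opaque: [u8; 0],
}

/// Detour sketch. fastcall passes the first two arguments in ecx and edx and
/// the rest on the stack, so `this` lands in ecx and `_edx` is a throwaway.
/// Only compiled for 32-bit x86, where fastcall exists.
#[cfg(target_arch = "x86")]
pub unsafe extern "fastcall" fn my_process_event(
    this: *mut Object,
    _edx: usize,
    function: *mut Function,
    parameters: *mut std::ffi::c_void,
    return_value: *mut std::ffi::c_void,
) {
    // Inspect or log the call here. A real detour must then forward all
    // arguments to the original ProcessEvent through whatever trampoline
    // the hooking mechanism provides; this sketch omits that step.
    let _ = (this, function, parameters, return_value);
}
```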

Create and integrate streaming code generator

From the block of text at the bottom of #14:

Implementing #13 makes me want to write a streaming code generator crate instead of using codegen.

Dumping is naturally a streaming process: you read the game structures from memory, cushion those structures into Rust syntax, and dump the code to sdk.rs. Sure, you do need to allocate some HashMap<&str, u8>s to create unique enum variants and identifiers, and sure, you need a Vec to keep track of bitfields, common enum variant prefixes, and method parameters, but on the whole, the bulk algorithm doesn't need to store previous states. Once you're done cushioning a structure, you can write to sdk.rs and start processing the next structure, forgetting about the previous.

codegen retains state of a created Scope. My sdk.rs is a single, large Scope. As I stream structures, that Scope grows. I dump the Scope into a String that gets dumped to sdk.rs. Along the way, Scope will allocate its own Strings and Vecs to keep track of enum variants, method parameters, etc. We don't need an intermediary Scope to keep state for us. We don't want this persistent state at all.

Another reason I want to write a streaming code generator is that I wouldn't need to allocate Strings for formatting purposes. I could write directly to the wrapped stream using write!.

For example, right now, if I want to emit an array type, I'd have to do something like:

let array_type = format!("[{}; {}]", element_type, array_dim);
codegen_struct.field(array_name, array_type); // behind-the-scenes, allocate String and append to a Vec.

When I really want to do something like:

write!(&mut sdk, "[{}; {}]", element_type, array_dim)?;

Where sdk is an in-memory structure or a BufWriter<File> or generally anything that implements io::Write or fmt::Write.

Of course one of the biggest advantages of codegen is the automatic formatting, especially indenting in nested blocks. So if I wanted to replace codegen, I'd have to create some sort of abstraction on top of the raw write! calls, otherwise I'd be seeing a soup of \n and (indent) in format strings.
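
A minimal sketch of what such an abstraction could look like, assuming a thin wrapper over io::Write that tracks only the current indentation depth. Generator, line, block, and emit_vector are hypothetical names, not an existing crate's API.

```rust
use std::fmt;
use std::io::{self, Write};

/// Streams generated code to `out`, remembering nothing but the indent depth.
pub struct Generator<W: Write> {
    out: W,
    depth: usize,
}

impl<W: Write> Generator<W> {
    pub fn new(out: W) -> Self {
        Self { out, depth: 0 }
    }

    /// Writes one indented line without allocating an intermediate String.
    pub fn line(&mut self, args: fmt::Arguments<'_>) -> io::Result<()> {
        for _ in 0..self.depth {
            self.out.write_all(b"    ")?;
        }
        self.out.write_fmt(args)?;
        self.out.write_all(b"\n")
    }

    /// Writes `header {`, runs `body` one level deeper, then writes `}`.
    pub fn block(
        &mut self,
        header: fmt::Arguments<'_>,
        body: impl FnOnce(&mut Self) -> io::Result<()>,
    ) -> io::Result<()> {
        self.line(format_args!("{} {{", header))?;
        self.depth += 1;
        let result = body(self);
        self.depth -= 1;
        result?;
        self.line(format_args!("}}"))
    }

    pub fn into_inner(self) -> W {
        self.out
    }
}

/// Example: stream the Vector structure from the sdk.rs split proposal.
pub fn emit_vector<W: Write>(sdk: &mut Generator<W>) -> io::Result<()> {
    sdk.line(format_args!("#[repr(C)]"))?;
    sdk.block(format_args!("pub struct {}", "Vector"), |sdk| {
        for field in ["X", "Y", "Z"] {
            sdk.line(format_args!("pub {}: f32,", field))?;
        }
        Ok(())
    })
}
```

The closure-based block keeps the newline and indent handling out of the format strings, and the same Generator works over a Vec<u8> in tests or a BufWriter<File> for the real dump.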

Strip common prefix from enum variants

Here's what a generated enum looks like:

#[repr(u8)]
pub enum EInterpCurveMode {
    CIM_Linear,
    CIM_CurveAuto,
    CIM_Constant,
    CIM_CurveUser,
    CIM_CurveBreak,
    CIM_CurveAutoClamped,
    CIM_MAX,
}

Since the variants of a Rust enum do not pollute the namespace that contains the enum, there is no need to prepend a common prefix to each enum variant. We should also normalize variant names so they are always PascalCase.

Here's how we should generate the above enum:

#[repr(u8)]
pub enum EInterpCurveMode {
    Linear,
    CurveAuto,
    Constant,
    CurveUser,
    CurveBreak,
    CurveAutoClamped,
    Max,
}
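
A minimal sketch of the transformation, assuming the prefix to strip is the longest shared run of characters that ends in an underscore, and that all-caps words such as MAX should become Max. The function names are hypothetical.

```rust
/// Length of the shared "PREFIX_" segment, cut back to the last underscore.
fn common_prefix_len(variants: &[&str]) -> usize {
    let (first, rest) = match variants.split_first() {
        Some(split) => split,
        None => return 0,
    };
    if rest.is_empty() {
        return 0; // A single variant has nothing to share a prefix with.
    }
    let mut len = first.len();
    for other in rest {
        len = first
            .bytes()
            .zip(other.bytes())
            .take(len)
            .take_while(|(a, b)| a == b)
            .count();
    }
    first.as_bytes()[..len]
        .iter()
        .rposition(|&b| b == b'_')
        .map_or(0, |i| i + 1)
}

/// Converts "MAX" to "Max" and "SOME_NAME" to "SomeName"; mixed-case words
/// such as "CurveAuto" are left unchanged.
fn pascal_case(name: &str) -> String {
    let mut out = String::with_capacity(name.len());
    for word in name.split('_').filter(|w| !w.is_empty()) {
        let shouting = !word.chars().any(|c| c.is_lowercase());
        let mut chars = word.chars();
        if let Some(first) = chars.next() {
            out.extend(first.to_uppercase());
            for c in chars {
                if shouting {
                    out.extend(c.to_lowercase());
                } else {
                    out.push(c);
                }
            }
        }
    }
    out
}

/// Strips the common prefix from every variant and normalizes the remainder.
pub fn normalize_variants(variants: &[&str]) -> Vec<String> {
    let mut prefix = common_prefix_len(variants);
    // Keep the prefix if stripping it would leave a variant with no name.
    if variants.iter().any(|v| v.len() == prefix) {
        prefix = 0;
    }
    variants.iter().map(|v| pascal_case(&v[prefix..])).collect()
}
```

The sketch does not handle a stripped name that starts with a digit, or two variants that collide after normalization; both would still need the existing numbering logic.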

Prepend outer class name to constants

The current generation process emits duplicate constants. For example, // WPS_MusicVolume = 107 shows up 14 times in the generated SDK. @KN4CK3R's C++ generator resolves these duplicates by prepending the outer class name to the constant.

template<typename T>
std::string MakeUniqueCppNameImpl(const T& t)
{
	std::string name;
	if (ObjectsStore().CountObjects<T>(t.GetName()) > 1)
	{
		name += MakeValidName(t.GetOuter().GetName()) + "_";
	}
	return name + MakeValidName(t.GetName());
}

We should also prepend the outer class name, but for all constants, whether or not they are duplicated, since the outer class name provides context on where the constant can be used.
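
A minimal sketch of the unconditional variant, assuming constants keep being emitted as comments in the `// Name = value` form shown above. write_constant is a hypothetical name, and OuterClass in the usage below is a placeholder rather than the real outer class of WPS_MusicVolume.

```rust
use std::fmt::{self, Write};

/// Writes one constant as a comment, always prefixed with its outer class name.
pub fn write_constant(out: &mut impl Write, outer: &str, name: &str, value: &str) -> fmt::Result {
    writeln!(out, "// {}_{} = {}", outer, name, value)
}
```

Calling write_constant(&mut sdk, "OuterClass", "WPS_MusicVolume", "107") would emit `// OuterClass_WPS_MusicVolume = 107`.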
