diff --git a/README.md b/README.md index a948c845..a46ac108 100644 --- a/README.md +++ b/README.md @@ -16,9 +16,7 @@ I'd appreciate if someone is willing to contribute or upload latest binaries. Bu - There is **no dynamic frequency scaling** in HOS, which makes _overclocking acts differently than PC_ or other mobile devices. The console will be _sticking to what frequency you've set in the long term_, until you close the game or put it into sleep. -- **ONLY ramp up RAM clock** beyond HOS maximum to 1862 / 1996 MHz if you want to _stay safe_. - -- Higher RAM clocks (> 1996.8 MHz) could be UNSTABLE and cause graphical glitches / instabilities / filesystem corruption. **Always make backup before usage.** +- Higher RAM clocks (> 1996.8 MHz) without proper timings could be UNSTABLE and cause graphical glitches / instabilities / filesystem corruption. **Always make backup before usage.** @@ -28,14 +26,15 @@ I'd appreciate if someone is willing to contribute or upload latest binaries. Bu - Most games are **bottlenecked by RAM bandwidth** - - Safe: 1862.4 / 1996.8 MHz - - 1862.4 / 1996.8 MHz is stable for all (Samsung / Micron / Hynix). - - Adjusted memory parameters (Mariko only). [Discussion](https://github.com/KazushiMe/Switch-OC-Suite/issues/5). + - Timings could be auto-adjusted (default), partially customized, or overwritten with entire mtc table. + + - Safe: ≤1996.8 MHz + - 1996.8 MHz has been tested stable for all (Samsung / Micron / Hynix), with built-in timing auto-adjustment. - Unsafe: > 1996.8 MHz or overvolting - - Higher RAM clocks might be stable for some chips without overvolting. [Not publicly available.](#Build) - - No evidence suggests that DRAM bus overvolting is useful. - - [Use this to set DRAM bus voltage](https://gist.github.com/KazushiMe/6bb0fcbefe0e03b1274079522516d56d). + - Timing customization: No GUI tool, requires [rebuilding](#Build). + - DRAM bus overvolting (Erista Only). 
+ - Mariko: [use this to set DRAM bus voltage](https://gist.github.com/KazushiMe/6bb0fcbefe0e03b1274079522516d56d). - **[System Settings (Optional)](https://github.com/KazushiMe/Switch-OC-Suite/blob/master/system_settings.md)** @@ -96,9 +95,25 @@ I'd appreciate if someone is willing to contribute or upload latest binaries. Bu 2. Mariko Only: Copy all files in `SdOut` to the root of SD card. - Erista: Use official sys-clk instead. Only `loader.kip` and some benchmark homebrew are available. -3. Grab `x.x.x_loader_xxxx.x.kip` for your Atmosphere version and desired RAM frequency, rename it to `loader.kip` and place it in `/atmosphere/kips/`. +3. Grab `x.x.x_loader.kip` for your Atmosphere version, rename it to `loader.kip` and place it in `/atmosphere/kips/`. -4. **Hekate-ipl bootloader** +4. Customization + | Defaults | Mariko | Erista | + | ---------- | ------------- | ------------ | + | CPU OC | 2397 MHz Max | Disabled | + | CPU Volt | 1220 mV Max | Disabled | + | GPU OC | 1305 MHz Max | N/A | + | RAM OC | 1996 MHz Max | 1996 MHz Max | + | RAM Volt | N/A | Disabled | + | RAM Timing | Auto-Adjusted | Disabled | + + - No parser/editor currently, although you could easily customize those with a hex editor: + - Search for ASCII string `CUST` + - All values are little-endian + - Switch to override or replace mode, NOT insert mode + - See `ldr_oc_suite.hpp` for config struct + +5. **Hekate-ipl bootloader** - Rename the kip to `loader.kip` and add `kip1=atmosphere/kips/loader.kip` in `bootloader/hekate_ipl.ini` - Erista: Minerva module conflicts with HOS DRAM training. Recompile with frequency changed is recommeded, although you could simply remove `bootloader/sys/libsys_minerva.bso`. 
@@ -113,6 +128,8 @@ Grab necessary patches from the repo, then compile sys-clk, ReverseNX-RT and Atm If you are to install nro forwarders, remove `R_TRY(ValidateAcidSignature(std::addressof(g_original_meta_cache.meta)));` in `Atmosphere/stratosphere/loader/source/ldr_meta.cpp` to make them work again. +Uncompress the kip to make it work with config editor: `hactool -t kip1 Atmosphere/stratosphere/loader/loader.kip --uncompress=Atmosphere/stratosphere/loader/loader.kip` + ## Why no CPU/GPU OC for Erista? @@ -125,6 +142,7 @@ If you are to install nro forwarders, remove `R_TRY(ValidateAcidSignature(std::a - You could spot battery draining at higher clocks under stress test, even with official 39W PD charger. - CPU / GPU performance at max clocks will be worse if power supply is not enough. +- CPU OC (up to ~ 2.1 GHz, depending on your CPU bin) is available mainly for emulation, but it does NOT work out of the box. ## Acknowledgement diff --git a/Source/Atmosphere/stratosphere/loader/source/ldr_oc_patch.hpp b/Source/Atmosphere/stratosphere/loader/source/ldr_oc_suite.cpp similarity index 61% rename from Source/Atmosphere/stratosphere/loader/source/ldr_oc_patch.hpp rename to Source/Atmosphere/stratosphere/loader/source/ldr_oc_suite.cpp index 40b7f501..7359bed2 100644 --- a/Source/Atmosphere/stratosphere/loader/source/ldr_oc_patch.hpp +++ b/Source/Atmosphere/stratosphere/loader/source/ldr_oc_suite.cpp @@ -15,47 +15,79 @@ */ //#define EXPERIMENTAL -#pragma once #include +#include "ldr_oc_suite.hpp" -namespace ams::ldr { - #include "ldr_oc_type.hpp" +namespace ams::ldr::oc { /* Allocate CustomizeTable in loader.kip (could be customized by user without recompiling) */ - static volatile CustomizeTable Customized = { - /******************** C P U ********************/ - // Max Clock in kHz: - // >= 2193000 will enable overvolting (> 1120 mV) - .cpuMaxClock = 2397000, + static const volatile CustomizeTable C = { + /* DRAM Timing: + * AUTO_ADJ_MARIKO_SAFE: Auto adjust timings 
for LPDDR4 ≤3733 Mbps specs, 8Gb density (Default). + * AUTO_ADJ_MARIKO_4266: Auto adjust timings for LPDDR4X 4266 Mbps specs, 8Gb density. + * ENTIRE_TABLE_ERISTA/ENTIRE_TABLE_MARIKO: + * Replace the entire max mtc table with a customized one. + * CUSTOMIZED_MARIKO: Override timings (partially). + */ + .mtcConf = AUTO_ADJ_MARIKO_SAFE, - // Max Voltage in mV: - // Default voltage: 1120 - // Haven't tested anything higher than 1220. - .cpuMaxVolt = 1220, + /* Mariko CPU: + * - Max Clock in kHz: + * Default: 1785000 + * >= 2193000 will enable overvolting (> 1120 mV) + * - Max Voltage in mV: + * Default voltage: 1120 + * Haven't tested anything higher than 1220. + */ + .marikoCpuMaxClock = 2397000, + .marikoCpuMaxVolt = 1220, - /******************** G P U ********************/ - // Max Clock in kHz: - .gpuMaxClock = 1305600, + /* Mariko GPU: + * - Max Clock in kHz: + * Default: 921600 + * NVIDIA Maximum: 1267200 + */ + .marikoGpuMaxClock = 1305600, - /****************** RAM / EMC ******************/ - // RAM(EMC) Clock in kHz: - // 1862400, 1894400, 1932800, 1996800, 2064000, 2099200, 2131200 - // (Other values might work as well) - // [Warning] RAM overclock could be UNSTABLE and cause severe problems: - // - Graphical glitches - // - System instabilities - // - NAND corruption - // 1862400/1996800 has been tested stable for all DRAM chips - .emcMaxClock = 1996800, + /* Erista CPU: + * Untested and not enabled by default. + * - Enable Overclock + * Requires modifications to NewCpuTables! + * - Max Voltage in mV + */ + .eristaCpuOCEnable = 0, + .eristaCpuMaxVolt = 0, + + /* Erista EMC: + * - RAM Voltage in mV + * Default(HOS): 1125 + * Not enabled by default. + */ + .eristaEmcVolt = 0, + + /* Common EMC: + * - RAM Clock in kHz: + * Values should be > 1600000, and evenly divisible by 9600 or 12800.
+ * [WARNING] + * RAM overclock could be UNSTABLE if timing parameters are not suitable for your DRAM: + * - Graphical glitches + * - System instabilities + * - NAND corruption + * Timings from auto-adjustment have been tested safe for up to 1996.8 MHz for all DRAM chips. + */ + .commonEmcMaxClock = 1996800, }; - const u32 EmcClock = Customized.emcMaxClock; - const u32 CpuMaxClock = Customized.cpuMaxClock; - const u32 CpuMaxVolt = Customized.cpuMaxVolt; - const u32 GpuMaxClock = Customized.gpuMaxClock; - MarikoMtcTable* const MtcCustomized = const_cast<MarikoMtcTable*>(std::addressof(Customized.mtcTable)); + namespace pcv::Mariko { + constexpr u32 CpuClkOSLimit = 1785'000; + constexpr u32 CpuClkOfficial = 1963'500; + constexpr u32 CpuVoltOfficial = 1120; + constexpr u32 GpuClkOfficial = 1267'200; + constexpr u32 MemClkOSLimit = 1600'000; + constexpr u32 MemClkOSAlt = 1331'200; - namespace pcv { + constexpr u32 CommonDvfsEntryCnt = 32; /* CPU */ + constexpr u32 OldCpuDvfsEntryCnt = 18; constexpr cpu_freq_cvb_table_t NewCpuTables[] = { // OldCpuTables // { 204000, { 721589, -12695, 27 }, {} }, @@ -81,9 +113,10 @@ namespace ams::ldr { { 2295000, { 1975655, -43531, 27 }, { 1120000 } }, { 2397000, { 2076220, -45036, 27 }, { 1120000 } }, }; - static_assert(sizeof(NewCpuTables) <= sizeof(cpu_freq_cvb_table_t)*14); + static_assert(sizeof(NewCpuTables) <= sizeof(cpu_freq_cvb_table_t) * (CommonDvfsEntryCnt - OldCpuDvfsEntryCnt)); /* GPU */ + constexpr u32 OldGpuDvfsEntryCnt = 17; constexpr gpu_cvb_pll_table_t NewGpuTables[] = { // OldGpuTables // { 76800, {}, { 610000, } }, @@ -105,7 +138,7 @@ namespace ams::ldr { // { 1267200, {}, { 1335531, -12567, -867, 0, 3681, 559 } }, { 1305600, {}, { 1374130, -13725, -859, 0, 4442, 576 } }, }; - static_assert(sizeof(NewGpuTables) <= sizeof(gpu_cvb_pll_table_t)*15); + static_assert(sizeof(NewGpuTables) <= sizeof(gpu_cvb_pll_table_t) * (CommonDvfsEntryCnt - OldGpuDvfsEntryCnt)); /* GPU Max Clock asm Pattern: * @@ -123,9 +156,9 @@ namespace ams::ldr {
0x52820000, 0x72A001C0 }; - volatile u32 gpuMaxClockMarikoPattern[2] = { - 0x52800000 | ((GpuMaxClock & 0xFFFF) << 5), - 0x72A00000 | (((GpuMaxClock >> 16) & 0xFFFF) << 5) + const u32 gpuMaxClockMarikoPattern[2] = { + (gpuOfficialMarikoPattern[0] & 0xFFE00000) | ((C.marikoGpuMaxClock & 0xFFFF) << 5), + (gpuOfficialMarikoPattern[1] & 0xFFE00000) | (((C.marikoGpuMaxClock >> 16) & 0xFFFF) << 5) }; #define COMPARE_HIGH(val1, val2, bit_div) (((val1 ^ val2) >> bit_div) == 0) @@ -142,7 +175,36 @@ namespace ams::ldr { // { 1600000, { 675, 650, 637, } }, // }; - void AdjustMtcTable(MarikoMtcTable* table, MarikoMtcTable* ref) + static void MtcPllmbDivHandler(MarikoMtcTable* table) { + // Calculate DIVM and DIVN (clock divisors) + // Common PLL oscillator is 38.4 MHz + // PLLMB_OUT = 38.4 MHz / PLLMB_DIVM * PLLMB_DIVN + typedef struct { + u8 numerator : 4; + u8 denominator : 4; + } pllmb_div; + + constexpr pllmb_div div[] = { + {3, 4}, {2, 3}, {1, 2}, {1, 3}, {1, 4}, {0, 1} + }; + + constexpr u32 pll_osc_in = 38400; + + u32 divm {}, divn {}; + const u32 remainder = C.commonEmcMaxClock % pll_osc_in; + for (const auto &index : div) { + if (remainder >= pll_osc_in * index.numerator / index.denominator) { + divm = index.denominator; + divn = C.commonEmcMaxClock / pll_osc_in * divm + index.numerator; + break; + } + } + + table->pllmb_divm = divm; + table->pllmb_divn = divn; + } + + static void MtcTableAutoAdjust(MarikoMtcTable* table, const MarikoMtcTable* ref) { /* Official Tegra X1 TRM, sign up for nvidia developer program (free) to download: * https://developer.nvidia.com/embedded/dlc/tegra-x1-technical-reference-manual @@ -161,8 +223,8 @@ namespace ams::ldr { * you'd better calculate timings yourself rather than relying on following algorithm.
*/ - #define ADJUST_PROP(TARGET, REF) \ - (u32)(std::ceil(REF + ((EmcClock-MemClkOSAlt)*(TARGET-REF))/(MemClkOSLimit-MemClkOSAlt))) + #define ADJUST_PROP(TARGET, REF) \ + (u32)(std::ceil(REF + ((C.commonEmcMaxClock-MemClkOSAlt)*(TARGET-REF))/(MemClkOSLimit-MemClkOSAlt))) #define ADJUST_PARAM(TARGET, REF) \ TARGET = ADJUST_PROP(TARGET, REF); @@ -202,43 +264,11 @@ namespace ams::ldr { // ADJUST_PARAM_TABLE(table, min_mrs_wait); // not used on LPDDR4X // ADJUST_PARAM_TABLE(table, latency); // not used - /* Patch PLLMB divisors */ - { - // Calculate DIVM and DIVN (clock divisors) - // Common PLL oscillator is 38.4 MHz - // PLLMB_OUT = 38.4 MHz / PLLLMB_DIVM * PLLMB_DIVN - u32 divm = 1; - u32 divn = EmcClock / 38400; - u32 remainder = EmcClock % 38400; - if (remainder >= 38400 * (3/4)) { - divm = 4; - divn = divn * divm + 3; - } else - if (remainder >= 38400 * (2/3)) { - divm = 3; - divn = divn * divm + 2; - } else - if (remainder >= 38400 * (1/2)) { - divm = 2; - divn = divn * divm + 1; - } else - if (remainder >= 38400 * (1/3)) { - divm = 3; - divn = divn * divm + 1; - } else - if (remainder >= 38400 * (1/4)) { - divm = 4; - divn = divn * divm + 1; - } - - table->pllmb_divm = divm; - table->pllmb_divn = divn; - } - /* Timings that are available in or can be derived from LPDDR4X datasheet or TRM */ { + const bool use_4266_spec = C.mtcConf == AUTO_ADJ_MARIKO_4266; // tCK_avg (average clock period) in ns - const double tCK_avg = (EmcClock == 2131200) ? 0.468 : 1000'000. / EmcClock; + const double tCK_avg = 1000'000. / C.commonEmcMaxClock; // tRPpb (row precharge time per bank) in ns const u32 tRPpb = 18; // tRPab (row precharge time all banks) in ns @@ -254,7 +284,7 @@ namespace ams::ldr { // tRCD (RAS-CAS delay) in ns const u32 tRCD = 18; // tRRD (Active bank-A to Active bank-B) in ns - const double tRRD = (EmcClock == 2131200) ? 7.5 : 10.; + const double tRRD = use_4266_spec ? 
7.5 : 10.; // tREFpb (average refresh interval per bank) in ns for 8Gb density const u32 tREFpb = 488; // tREFab (average refresh interval all 8 banks) in ns for 8Gb density @@ -264,7 +294,7 @@ namespace ams::ldr { // {REFRESH, REFRESH_LO} = max[(tREF/#_of_rows) / (emc_clk_period) - 64, (tREF/#_of_rows) / (emc_clk_period) * 97%] // emc_clk_period = dram_clk / 2; // 1600 MHz: 5894, but N' set to 6176 (~4.8% margin) - const u32 REFRESH = std::ceil((double(tREFpb) * EmcClock / numOfRows * (1.048) / 2 - 64)) / 4 * 4; + const u32 REFRESH = u32(std::ceil((double(tREFpb) * C.commonEmcMaxClock / numOfRows * 1.048 / 2 - 64))) / 4 * 4; // tPDEX2WR, tPDEX2RD (timing delay from exiting powerdown mode to a write/read command) in ns const u32 tPDEX2 = 10; // [Guessed] tACT2PDEN (timing delay from an activate, MRS or EMRS command to power-down entry) in ns @@ -282,9 +312,9 @@ namespace ams::ldr { // [Guessed] tPD (minimum CKE low pulse width in power-down mode) in ns const double tPD = 7.5; // tFAW (Four-bank Activate Window) in ns - const u32 tFAW = (EmcClock == 2131200) ? 30 : 40; + const u32 tFAW = use_4266_spec ? 30 : 40; - #define GET_CYCLE_CEIL(PARAM) std::ceil(double(PARAM) / tCK_avg) + #define GET_CYCLE_CEIL(PARAM) u32(std::ceil(double(PARAM) / tCK_avg)) WRITE_PARAM_ALL_REG(table, emc_rc, GET_CYCLE_CEIL(tRC)); WRITE_PARAM_ALL_REG(table, emc_rfc, GET_CYCLE_CEIL(tRFCab)); @@ -324,6 +354,8 @@ namespace ams::ldr { table->burst_mc_regs.mc_emem_arb_timing_rfcpb = std::ceil(GET_CYCLE_CEIL(tRFCpb) / MC_ARB_DIV + 1); // ? 
} + MtcPllmbDivHandler(table); + #ifdef EXPERIMENTAL { #define ADJUST_PARAM_ROUND2_ALL_REG(TARGET_TABLE, REF_TABLE, PARAM) \ @@ -363,7 +395,7 @@ namespace ams::ldr { | ADJUST_BIT(TARGET_TABLE->shadow_regs_rdwr_train.PARAM, REF_TABLE->shadow_regs_rdwr_train.PARAM, HIGH2, LOW2) << LOW2; /* For latency allowance */ - #define ADJUST_INVERSE(TARGET) (TARGET * (MemClkOSLimit / 1000) / (EmcClock / 1000)) + #define ADJUST_INVERSE(TARGET) (TARGET * (MemClkOSLimit / 1000) / (C.commonEmcMaxClock / 1000)) /* emc_wdv, emc_wsv, emc_wev, emc_wdv_mask, emc_quse, emc_quse_width, emc_ibdly, emc_obdly, @@ -661,19 +693,187 @@ namespace ams::ldr { #endif } - Result PcvCpuClockVddHandler(u32* ptr) { - u32 value_next2 = *(ptr + 2); - constexpr u32 cpuClockVddCpuPatternNext = 0; - if (value_next2 != cpuClockVddCpuPatternNext) - { - return ResultFailure(); - } + static void MtcTableCustomize(MarikoMtcTable* table) { + #define HANDLER_COMMON(PARAM) \ + if (C.marikoTiming.common.PARAM != DO_NOT_OVERRIDE) {\ + u32 cache = C.marikoTiming.common.PARAM; \ + if (cache == OVERRIDE_WITH_ZERO) \ + cache = 0; \ + table->burst_regs.PARAM = cache; \ + table->shadow_regs_ca_train.PARAM = cache; \ + table->shadow_regs_rdwr_train.PARAM = cache; \ + } - PatchOffset(ptr, CpuMaxClock); + #define HANDLER(PARAM) \ + if (C.marikoTiming.PARAM) { \ + u32 cache = C.marikoTiming.PARAM;\ + if (cache == OVERRIDE_WITH_ZERO) \ + cache = 0; \ + table->PARAM = cache; \ + } + + HANDLER_COMMON(emc_rc); + HANDLER_COMMON(emc_rfc); + HANDLER_COMMON(emc_rfcpb); + HANDLER_COMMON(emc_ras); + HANDLER_COMMON(emc_rp); + HANDLER_COMMON(emc_r2w); + HANDLER_COMMON(emc_w2r); + HANDLER_COMMON(emc_r2p); + HANDLER_COMMON(emc_w2p); + HANDLER_COMMON(emc_trtm); + HANDLER_COMMON(emc_twtm); + HANDLER_COMMON(emc_tratm); + HANDLER_COMMON(emc_twatm); + HANDLER_COMMON(emc_rd_rcd); + HANDLER_COMMON(emc_wr_rcd); + HANDLER_COMMON(emc_rrd); + HANDLER_COMMON(emc_wdv); + HANDLER_COMMON(emc_wsv); + HANDLER_COMMON(emc_wev); + HANDLER_COMMON(emc_wdv_mask); 
+ HANDLER_COMMON(emc_quse); + HANDLER_COMMON(emc_quse_width); + HANDLER_COMMON(emc_ibdly); + HANDLER_COMMON(emc_obdly); + HANDLER_COMMON(emc_einput); + HANDLER_COMMON(emc_einput_duration); + HANDLER_COMMON(emc_qrst); + HANDLER_COMMON(emc_qsafe); + HANDLER_COMMON(emc_rdv); + HANDLER_COMMON(emc_rdv_mask); + HANDLER_COMMON(emc_rdv_early); + HANDLER_COMMON(emc_rdv_early_mask); + HANDLER_COMMON(emc_refresh); + HANDLER_COMMON(emc_pre_refresh_req_cnt); + HANDLER_COMMON(emc_pdex2wr); + HANDLER_COMMON(emc_pdex2rd); + HANDLER_COMMON(emc_act2pden); + HANDLER_COMMON(emc_rw2pden); + HANDLER_COMMON(emc_cke2pden); + HANDLER_COMMON(emc_pdex2mrr); + HANDLER_COMMON(emc_txsr); + HANDLER_COMMON(emc_txsrdll); + HANDLER_COMMON(emc_tcke); + HANDLER_COMMON(emc_tckesr); + HANDLER_COMMON(emc_tpd); + HANDLER_COMMON(emc_tfaw); + HANDLER_COMMON(emc_trpab); + HANDLER_COMMON(emc_tclkstop); + HANDLER_COMMON(emc_trefbw); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dq_rank1_4); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dq_rank1_5); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank0_0); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank0_1); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank0_3); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank0_4); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank0_5); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank1_0); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank1_1); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank1_3); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank1_4); + HANDLER_COMMON(emc_pmacro_ob_ddll_long_dqs_rank1_5); + HANDLER_COMMON(emc_pmacro_ddll_long_cmd_0); + HANDLER_COMMON(emc_pmacro_ddll_long_cmd_1); + HANDLER_COMMON(emc_pmacro_ddll_long_cmd_2); + HANDLER_COMMON(emc_pmacro_ddll_long_cmd_3); + HANDLER_COMMON(emc_pmacro_ddll_long_cmd_4); + HANDLER_COMMON(emc_zcal_wait_cnt); + HANDLER_COMMON(emc_mrs_wait_cnt); + HANDLER_COMMON(emc_mrs_wait_cnt2); + HANDLER_COMMON(emc_auto_cal_channel); + HANDLER_COMMON(emc_pmacro_dll_cfg_2); + 
HANDLER_COMMON(emc_pmacro_autocal_cfg_common); + HANDLER_COMMON(emc_dyn_self_ref_control); + HANDLER_COMMON(emc_qpop); + HANDLER_COMMON(emc_pmacro_cmd_pad_tx_ctrl); + HANDLER_COMMON(emc_tr_timing_0); + HANDLER_COMMON(emc_tr_rdv); + HANDLER_COMMON(emc_tr_qpop); + HANDLER_COMMON(emc_tr_rdv_mask); + HANDLER_COMMON(emc_tr_qsafe); + HANDLER_COMMON(emc_tr_qrst); + HANDLER_COMMON(emc_training_vref_settle); + + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank0_0); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank0_1); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank0_2); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank0_3); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank0_4); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank0_5); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank1_0); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank1_1); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank1_2); + HANDLER(trim_regs.emc_pmacro_ob_ddll_long_dq_rank1_3); + + HANDLER(dram_timings.rl); + + HANDLER(burst_mc_regs.mc_emem_arb_cfg); + HANDLER(burst_mc_regs.mc_emem_arb_timing_rcd); + HANDLER(burst_mc_regs.mc_emem_arb_timing_rp); + HANDLER(burst_mc_regs.mc_emem_arb_timing_rc); + HANDLER(burst_mc_regs.mc_emem_arb_timing_ras); + HANDLER(burst_mc_regs.mc_emem_arb_timing_faw); + HANDLER(burst_mc_regs.mc_emem_arb_timing_wap2pre); + HANDLER(burst_mc_regs.mc_emem_arb_timing_r2w); + HANDLER(burst_mc_regs.mc_emem_arb_timing_w2r); + HANDLER(burst_mc_regs.mc_emem_arb_timing_rfcpb); + HANDLER(burst_mc_regs.mc_emem_arb_da_turns); + HANDLER(burst_mc_regs.mc_emem_arb_da_covers); + HANDLER(burst_mc_regs.mc_emem_arb_misc0); + + HANDLER(la_scale_regs.mc_mll_mpcorer_ptsa_rate); + HANDLER(la_scale_regs.mc_ptsa_grant_decrement); + HANDLER(la_scale_regs.mc_latency_allowance_xusb_0); + HANDLER(la_scale_regs.mc_latency_allowance_xusb_1); + HANDLER(la_scale_regs.mc_latency_allowance_tsec_0); + HANDLER(la_scale_regs.mc_latency_allowance_sdmmca_0); + 
HANDLER(la_scale_regs.mc_latency_allowance_sdmmcaa_0); + HANDLER(la_scale_regs.mc_latency_allowance_sdmmc_0); + HANDLER(la_scale_regs.mc_latency_allowance_sdmmcab_0); + HANDLER(la_scale_regs.mc_latency_allowance_ppcs_1); + HANDLER(la_scale_regs.mc_latency_allowance_mpcore_0); + HANDLER(la_scale_regs.mc_latency_allowance_hc_0); + HANDLER(la_scale_regs.mc_latency_allowance_hc_1); + HANDLER(la_scale_regs.mc_latency_allowance_avpc_0); + HANDLER(la_scale_regs.mc_latency_allowance_gpu_0); + HANDLER(la_scale_regs.mc_latency_allowance_gpu2_0); + HANDLER(la_scale_regs.mc_latency_allowance_nvenc_0); + HANDLER(la_scale_regs.mc_latency_allowance_nvdec_0); + HANDLER(la_scale_regs.mc_latency_allowance_vic_0); + HANDLER(la_scale_regs.mc_latency_allowance_vi2_0); + HANDLER(la_scale_regs.mc_latency_allowance_isp2_1); + + HANDLER(pllm_ss_ctrl1); + HANDLER(pllm_ss_ctrl2); + HANDLER(pllmb_ss_ctrl1); + HANDLER(pllmb_ss_ctrl2); + HANDLER(pllmb_divm); + HANDLER(pllmb_divn); + HANDLER(min_mrs_wait); + HANDLER(emc_mrw); + HANDLER(emc_mrw2); + HANDLER(emc_cfg_2); + HANDLER(latency); + + if (C.marikoTiming.pllmb_divm == DO_NOT_OVERRIDE || C.marikoTiming.pllmb_divn == DO_NOT_OVERRIDE) + MtcPllmbDivHandler(table); + } + + static Result CpuClockVddHandler(u32* ptr) { + if (C.marikoCpuMaxClock) { + u32 value_next2 = *(ptr + 2); + constexpr u32 cpuClockVddCpuPatternNext = 0; + if (value_next2 != cpuClockVddCpuPatternNext) + return ResultFailure(); + + PatchOffset(ptr, C.marikoCpuMaxClock); + } return ResultSuccess(); } - Result PcvCpuDvfsHandler(cpu_freq_cvb_table_t* entry_1963, uintptr_t nso_end_offset) { + static Result CpuDvfsHandler(u32* ptr, uintptr_t nso_end_offset) { + cpu_freq_cvb_table_t* entry_1963 = reinterpret_cast<cpu_freq_cvb_table_t*>(ptr); cpu_freq_cvb_table_t* entry_free = entry_1963 + 1; cpu_freq_cvb_table_t* entry_204 = entry_free - 18; uintptr_t entry_end_offset = reinterpret_cast<uintptr_t>(entry_free) + sizeof(NewCpuTables) - sizeof(u32); @@ -693,16 +893,18 @@ namespace ams::ldr { if
(entry_current->cvb_pll_param.c0 != CpuVoltOfficial * 1000) return ResultFailure(); - while (entry_current->cvb_pll_param.c0 == CpuVoltOfficial * 1000) - { - PatchOffset(reinterpret_cast<u32*>(std::addressof(entry_current->cvb_pll_param)), CpuMaxVolt * 1000); - entry_current--; + if (C.marikoCpuMaxVolt) { + while (entry_current->cvb_pll_param.c0 == CpuVoltOfficial * 1000) { + PatchOffset(reinterpret_cast<u32*>(std::addressof(entry_current->cvb_pll_param)), C.marikoCpuMaxVolt * 1000); + entry_current--; + } } return ResultSuccess(); } - Result PcvGpuDvfsHandler(gpu_cvb_pll_table_t* entry_1267, uintptr_t nso_end_offset) { + static Result GpuDvfsHandler(u32* ptr, uintptr_t nso_end_offset) { + gpu_cvb_pll_table_t* entry_1267 = reinterpret_cast<gpu_cvb_pll_table_t*>(ptr); gpu_cvb_pll_table_t* entry_free = entry_1267 + 1; gpu_cvb_pll_table_t* entry_76_8 = entry_free - 17; uintptr_t entry_end_offset = reinterpret_cast<uintptr_t>(entry_free) + sizeof(NewGpuTables) - sizeof(u32); @@ -719,23 +921,24 @@ namespace ams::ldr { return ResultSuccess(); } - Result PcvCpuVoltRangeHandler(u32* ptr) { - const std::vector<u32> acceptableCpuMinVolt = { 800, 637, 620, 610 }; + static Result CpuVoltRangeHandler(u32* ptr) { + if (!C.marikoCpuMaxVolt) + return ResultSuccess(); + u32 value_cpu_min_volt = *(ptr - 1); - - for (const auto &min_volt : acceptableCpuMinVolt) - { - if (min_volt == value_cpu_min_volt) - { - PatchOffset(ptr, CpuMaxVolt); + switch (value_cpu_min_volt) { + case 800: + case 637: + case 620: + case 610: + PatchOffset(ptr, C.marikoCpuMaxVolt); return ResultSuccess(); - } + default: + return ResultFailure(); } - - return ResultFailure(); } - Result PcvGpuMaxClockMarikoAsmHandler(u32* ptr) { + static Result GpuMaxClockHandler(u32* ptr) { u32 value = *(ptr); u32* ptr_next = ptr + 1; u32 value_next = *(ptr_next); @@ -754,149 +957,314 @@ namespace ams::ldr { return ResultFailure(); } - Result PcvMemMaxClockHandler(uintptr_t ptr, bool isMariko) { - if (isMariko) - { - // Mariko have 3 mtc tables (204/1331/1600 MHz), only these
3 frequencies could be set. - // Replace 1331 MHz with 1600 MHz as perf @ 1331 MHz is crap. - u32 value_next = *(reinterpret_cast<u32*>(ptr) + 1); - u32 value_next2 = *(reinterpret_cast<u32*>(ptr) + 2); + static Result MtcTableHandler(u32* ptr) { + MarikoMtcTable* const mtc_table_max = reinterpret_cast<MarikoMtcTable*>(ptr - offsetof(MarikoMtcTable, rate_khz) / sizeof(u32)); + MarikoMtcTable* const mtc_table_alt = mtc_table_max - 1; + constexpr u32 mtc_mariko_rev = 3; + if ( mtc_table_max->rev != mtc_mariko_rev + || mtc_table_alt->rev != mtc_mariko_rev + || mtc_table_alt->rate_khz != MemClkOSAlt ) + return ResultFailure(); - constexpr u32 mtc_mariko_min_volt = 1100; - // constexpr u32 mtc_erista_min_volt = 887; - constexpr u32 dvb_entry_volt = 675; - constexpr u32 mtc_mariko_rev = 3; - - if (value_next == mtc_mariko_min_volt) - { - MarikoMtcTable* const mtc_table_max = reinterpret_cast<MarikoMtcTable*>(ptr - offsetof(MarikoMtcTable, rate_khz)); - MarikoMtcTable* const mtc_table_alt = mtc_table_max - 1; - if ( mtc_table_max->rev != mtc_mariko_rev - || mtc_table_alt->rev != mtc_mariko_rev - || mtc_table_alt->rate_khz != MemClkOSAlt ) - return ResultFailure(); - - bool useCustomizedTable = MtcCustomized->rev != INVALID_MTC_TABLE; - if (useCustomizedTable) - { - std::memcpy(reinterpret_cast<void*>(mtc_table_alt), reinterpret_cast<void*>(mtc_table_max), sizeof(MarikoMtcTable)); - std::memcpy(reinterpret_cast<void*>(mtc_table_max), reinterpret_cast<void*>(MtcCustomized), sizeof(MarikoMtcTable)); - return ResultSuccess(); - } - - std::memcpy(reinterpret_cast<void*>(MtcCustomized), reinterpret_cast<void*>(mtc_table_max), sizeof(MarikoMtcTable)); - AdjustMtcTable(mtc_table_max, mtc_table_alt); - std::memcpy(reinterpret_cast<void*>(mtc_table_alt), reinterpret_cast<void*>(MtcCustomized), sizeof(MarikoMtcTable)); - MtcCustomized->rev = INVALID_MTC_TABLE; - } - else if (value_next2 == dvb_entry_volt) - { - emc_dvb_dvfs_table_t* dvb_max_entry = reinterpret_cast<emc_dvb_dvfs_table_t*>(ptr); - emc_dvb_dvfs_table_t* dvb_1331_entry = dvb_max_entry - 1; - - u32* dvb_1331_offset =
reinterpret_cast<u32*>(dvb_1331_entry); - if (*(dvb_1331_offset) != MemClkOSAlt) - return ResultFailure(); - - PatchOffset(dvb_1331_offset, MemClkOSLimit); - } + MarikoMtcTable* const table = const_cast<MarikoMtcTable*>(std::addressof(C.marikoMtc)); + bool replace_entire_table = (C.mtcConf == ENTIRE_TABLE_MARIKO); + if (replace_entire_table) { + std::memcpy(reinterpret_cast<void*>(mtc_table_alt), reinterpret_cast<void*>(mtc_table_max), sizeof(MarikoMtcTable)); + std::memcpy(reinterpret_cast<void*>(mtc_table_max), reinterpret_cast<void*>(table), sizeof(MarikoMtcTable)); + return ResultSuccess(); } - PatchOffset(ptr, EmcClock); + bool customized_timing = (C.mtcConf == CUSTOMIZED_MARIKO); + if (customized_timing) { + std::memcpy(reinterpret_cast<void*>(mtc_table_alt), reinterpret_cast<void*>(mtc_table_max), sizeof(MarikoMtcTable)); + MtcTableCustomize(mtc_table_max); + } else { + std::memcpy(reinterpret_cast<void*>(table), reinterpret_cast<void*>(mtc_table_max), sizeof(MarikoMtcTable)); + MtcTableAutoAdjust(mtc_table_max, mtc_table_alt); + std::memcpy(reinterpret_cast<void*>(mtc_table_alt), reinterpret_cast<void*>(table), sizeof(MarikoMtcTable)); + } return ResultSuccess(); } - void ApplyAutoPcvPatch(uintptr_t mapped_nso, size_t nso_size) { - /* Abort immediately once something goes wrong */ - bool isMariko = (spl::GetSocType() == spl::SocType_Mariko); + static Result DvbTableHandler(u32* ptr) { + emc_dvb_dvfs_table_t* dvb_max_entry = reinterpret_cast<emc_dvb_dvfs_table_t*>(ptr); + emc_dvb_dvfs_table_t* dvb_1331_entry = dvb_max_entry - 1; - u8 cpuClockVddMariko {}; - u8 cpuTableMariko {}; - u8 gpuTableMariko {}; - u8 cpuMaxVoltMariko {}; - u8 gpuMaxClockMariko {}; + u32* dvb_1331_offset = reinterpret_cast<u32*>(dvb_1331_entry); + if (*(dvb_1331_offset) != MemClkOSAlt) + return ResultFailure(); - uintptr_t ptr = mapped_nso; - while (ptr <= mapped_nso + nso_size - std::max(sizeof(MarikoMtcTable), sizeof(EristaMtcTable))) - { - u32 value = *(reinterpret_cast<u32*>(ptr)); + PatchOffset(dvb_1331_offset, MemClkOSLimit); + return ResultSuccess(); + } - if (isMariko) - { - if (value == CpuClkOSLimit)
- { - if (R_SUCCEEDED(PcvCpuClockVddHandler(reinterpret_cast<u32*>(ptr)))) - cpuClockVddMariko++; - } + static Result MemMaxClockHandler(u32* ptr) { + u32 value_next = *(ptr + 1); + u32 value_next2 = *(ptr + 2); - if (value == CpuClkOfficial) - { - if (R_SUCCEEDED(PcvCpuDvfsHandler(reinterpret_cast<cpu_freq_cvb_table_t*>(ptr), mapped_nso + nso_size))) - cpuTableMariko++; - } + // Mariko have 3 mtc tables (204/1331/1600 MHz), only these 3 frequencies could be set. + // Replace 1331 MHz with 1600 MHz as perf @ 1331 MHz is crap. + constexpr u32 mtc_min_volt = 1100; + constexpr u32 dvb_entry_volt = 675; - if (value == GpuClkOfficial) - { - if (R_SUCCEEDED(PcvGpuDvfsHandler(reinterpret_cast<gpu_cvb_pll_table_t*>(ptr), mapped_nso + nso_size))) - gpuTableMariko++; - } + Result rc = ResultSuccess(); - if (value == CpuVoltOfficial) - { - if (R_SUCCEEDED(PcvCpuVoltRangeHandler(reinterpret_cast<u32*>(ptr)))) - cpuMaxVoltMariko++; - } - - if (COMPARE_HIGH(value, gpuOfficialMarikoPattern[0], 5)) - { - if (R_SUCCEEDED(PcvGpuMaxClockMarikoAsmHandler(reinterpret_cast<u32*>(ptr)))) - gpuMaxClockMariko++; - } - } - - if (value == MemClkOSLimit) - { - if (R_FAILED(PcvMemMaxClockHandler(ptr, isMariko))) - AMS_ABORT(); - } - - ptr += sizeof(u32); + if (value_next == mtc_min_volt) { + rc = MtcTableHandler(ptr); + } else if (value_next2 == dvb_entry_volt) { + rc = DvbTableHandler(ptr); } - if (isMariko) - { - constexpr u8 cpuMaxVoltMarikoMaxCnt = 13; - constexpr u8 gpuMaxClockMarikoReqCnt = 2; + PatchOffset(ptr, C.commonEmcMaxClock); + return rc; + } - if (cpuClockVddMariko != 1) - AMS_ABORT(); - if (cpuTableMariko != 1) - AMS_ABORT(); - if (gpuTableMariko != 1) - AMS_ABORT(); - if (cpuMaxVoltMariko > cpuMaxVoltMarikoMaxCnt || !cpuMaxVoltMariko) - AMS_ABORT(); - if (gpuMaxClockMariko != gpuMaxClockMarikoReqCnt) - AMS_ABORT(); + static void Patch(uintptr_t mapped_nso, size_t nso_size) { + if (C.custRev != CUST_REV) { + AMS_ABORT(); + __builtin_unreachable(); + } + + enum PatchSuccessCnt { + MEM_CLOCK, + CPU_CLOCK_VDD, + CPU_TABLE, + GPU_TABLE,
CPU_MAX_VOLT, + GPU_MAX_CLOCK, + CNT_MAX, + }; + + u8 cnt[CNT_MAX] = {}; + + for (uintptr_t ptr = mapped_nso; + ptr <= mapped_nso + nso_size - sizeof(MarikoMtcTable); + ptr += sizeof(u32)) + { + u32* ptr32 = reinterpret_cast<u32*>(ptr); + u32 value = *(ptr32); + + switch (value) { + case CpuClkOSLimit: [[unlikely]] + if (R_SUCCEEDED(CpuClockVddHandler(ptr32))) + cnt[CPU_CLOCK_VDD]++; + continue; + case CpuClkOfficial: [[unlikely]] + if (R_SUCCEEDED(CpuDvfsHandler(ptr32, mapped_nso + nso_size))) + cnt[CPU_TABLE]++; + continue; + case GpuClkOfficial: [[unlikely]] + if (R_SUCCEEDED(GpuDvfsHandler(ptr32, mapped_nso + nso_size))) + cnt[GPU_TABLE]++; + continue; + case CpuVoltOfficial:[[unlikely]] + if (R_SUCCEEDED(CpuVoltRangeHandler(ptr32))) + cnt[CPU_MAX_VOLT]++; + continue; + case MemClkOSLimit: [[unlikely]] + if (R_SUCCEEDED(MemMaxClockHandler(ptr32))) + cnt[MEM_CLOCK]++; + continue; + default: [[likely]] + break; + } + + if (COMPARE_HIGH(value, gpuOfficialMarikoPattern[0], 5)) { + if (R_SUCCEEDED(GpuMaxClockHandler(ptr32))) + cnt[GPU_MAX_CLOCK]++; + continue; + } + } + + if ( !cnt[MEM_CLOCK] + || cnt[CPU_CLOCK_VDD] != 1 + || cnt[CPU_TABLE] != 1 + || cnt[GPU_TABLE] != 1 + || cnt[CPU_MAX_VOLT] > 13 || !cnt[CPU_MAX_VOLT] + || cnt[GPU_MAX_CLOCK] != 2) + { + AMS_ABORT(); + __builtin_unreachable(); } } } + namespace pcv::Erista { + constexpr u32 CpuClkOSLimit = 1785'000; + constexpr u32 CpuVoltLimit1 = 1132; + constexpr u32 CpuVoltLimit2 = 1170; + constexpr u32 CpuVoltLimit3 = 1227; + constexpr u32 MemVoltHOS = 1125'000; + constexpr u32 MemClkOSLimit = 1600'000; + + constexpr cpu_freq_cvb_table_t NewCpuTables[] = { + // OldCpuTables + // { 204000, { 721094 }, {} }, + // { 306000, { 754040 }, {} }, + // { 408000, { 786986 }, {} }, + // { 510000, { 819932 }, {} }, + // { 612000, { 852878 }, {} }, + // { 714000, { 885824 }, {} }, + // { 816000, { 918770 }, {} }, + // { 918000, { 951716 }, {} }, + // { 1020000, { 984662 }, { -2875621, 358099, -8585 } }, + // { 1122000, { 1017608
}, { -52225, 104159, -2816 } }, + // { 1224000, { 1050554 }, { 1076868, 8356, -727 } }, + // { 1326000, { 1083500 }, { 2208191, -84659, 1240 } }, + // { 1428000, { 1116446 }, { 2519460, -105063, 1611 } }, + // { 1581000, { 1130000 }, { 2889664, -122173, 1834 } }, + // { 1683000, { 1168000 }, { 5100873, -279186, 4747 } }, + // { 1785000, { 1227500 }, { 5100873, -279186, 4747 } }, + { 1887000, {}, {} }, + { 1989000, {}, {} }, + { 2091000, {}, {} }, + }; + + static Result CpuDvfsHandler(u32* ptr, uintptr_t nso_end_offset) { + if (!C.eristaCpuOCEnable) + return ResultSuccess(); + + cpu_freq_cvb_table_t* entry_1785 = reinterpret_cast(ptr); + cpu_freq_cvb_table_t* entry_free = entry_1785 + 1; + cpu_freq_cvb_table_t* entry_204 = entry_free - 16; + uintptr_t entry_end_offset = reinterpret_cast(entry_free) + sizeof(NewCpuTables) - sizeof(u32); + + if ( entry_end_offset >= nso_end_offset + || *(reinterpret_cast(entry_free)) != 0 + || *(reinterpret_cast(entry_204)) != 204'000 + || *(reinterpret_cast(entry_end_offset)) != 0 ) + { + return ResultFailure(); + } + + std::memcpy(reinterpret_cast(entry_free), NewCpuTables, sizeof(NewCpuTables)); + return ResultSuccess(); + } + + static Result CpuVoltRangeHandler(u32* ptr) { + if (!C.eristaCpuMaxVolt) + return ResultSuccess(); + + u32 value_cpu_min_volt = *(ptr - 1); + switch (value_cpu_min_volt) { + case 950: + case 850: + case 825: + case 810: + PatchOffset(ptr, C.eristaCpuMaxVolt); + return ResultSuccess(); + default: + return ResultFailure(); + } + } + + static Result MtcTableHandler(u32* ptr) { + bool replace_entire_table = (C.mtcConf == ENTIRE_TABLE_ERISTA); + if (replace_entire_table) { + EristaMtcTable* const mtc_table_max = reinterpret_cast(ptr - offsetof(EristaMtcTable, rate_khz) / sizeof(u32)); + EristaMtcTable* const table = const_cast(std::addressof(C.eristaMtc)); + std::memcpy(reinterpret_cast(mtc_table_max), reinterpret_cast(table), sizeof(EristaMtcTable)); + } + + return ResultSuccess(); + } + + static Result 
MemMaxClockHandler(u32* ptr) { + u32 value_next = *(ptr + 1); + constexpr u32 mtc_min_volt = 887; + + Result rc = ResultSuccess(); + if (value_next == mtc_min_volt) { + rc = MtcTableHandler(ptr); + } + + PatchOffset(ptr, C.commonEmcMaxClock); + return rc; + } + + static Result MemVoltHandler(u32* ptr) { + if (C.eristaEmcVolt) + PatchOffset(ptr, C.eristaEmcVolt); + + return ResultSuccess(); + } + + static void Patch(uintptr_t mapped_nso, size_t nso_size) { + enum PatchSuccessCnt { + CPU_CLOCK, + CPU_MAX_VOLT, + MEM_CLOCK, + MEM_VOLT, + CNT_MAX, + }; + + u8 cnt[CNT_MAX] = {}; + + for (uintptr_t ptr = mapped_nso; + ptr <= mapped_nso + nso_size - sizeof(EristaMtcTable); + ptr += sizeof(u32)) + { + u32* ptr32 = reinterpret_cast(ptr); + u32 value = *(ptr32); + + switch (value) { + case CpuClkOSLimit: [[unlikely]] + if (R_SUCCEEDED(CpuDvfsHandler(ptr32, mapped_nso + nso_size))) + cnt[CPU_CLOCK]++; + continue; + case CpuVoltLimit1: [[unlikely]] + case CpuVoltLimit2: [[unlikely]] + case CpuVoltLimit3: [[unlikely]] + if (R_SUCCEEDED(CpuVoltRangeHandler(ptr32))) + cnt[CPU_MAX_VOLT]++; + continue; + case MemClkOSLimit: [[unlikely]] + if (R_SUCCEEDED(MemMaxClockHandler(ptr32))) + cnt[MEM_CLOCK]++; + continue; + case MemVoltHOS: [[unlikely]] + if (R_SUCCEEDED(MemVoltHandler(ptr32))) + cnt[MEM_VOLT]++; + continue; + default: [[likely]] + break; + } + } + + if (!cnt[MEM_CLOCK] || cnt[MEM_VOLT] != 2) { + AMS_ABORT(); + __builtin_unreachable(); + } + } + } + + namespace pcv { + void Patch(uintptr_t mapped_nso, size_t nso_size) { + bool isMariko = (spl::GetSocType() == spl::SocType_Mariko); + if (isMariko) + Mariko::Patch(mapped_nso, nso_size); + else + Erista::Patch(mapped_nso, nso_size); + } + } + namespace ptm { - void ApplyAutoPtmPatch(uintptr_t mapped_nso, size_t nso_size) { + void Patch(uintptr_t mapped_nso, size_t nso_size) { /* No abort here as ptm is not that critical */ - if (spl::GetSocType() == spl::SocType_Erista) + bool isMariko = (spl::GetSocType() == 
spl::SocType_Mariko); + if (!isMariko) return; perf_conf_entry* confTable = 0; constexpr u32 entryCnt = 16; - constexpr u32 memPtmLimit = MemClkOSLimit * 1000; - constexpr u32 memPtmAlt = MemClkOSAlt * 1000; - constexpr u32 memPtmClamp = MemClkOSClampDn * 1000; - const u32 memPtmMax = EmcClock * 1000; + constexpr u32 memPtmLimit = 1600'000'000; + constexpr u32 memPtmAlt = 1331'200'000; + constexpr u32 memPtmClamp = 1065'600'000; + const u32 memPtmMax = C.commonEmcMaxClock * 1000; - uintptr_t ptr = mapped_nso; - while (ptr <= mapped_nso + nso_size - sizeof(perf_conf_entry) * entryCnt) + for (uintptr_t ptr = mapped_nso; + ptr <= mapped_nso + nso_size - sizeof(perf_conf_entry) * entryCnt; + ptr += sizeof(u32)) { u32 value = *(reinterpret_cast(ptr)); @@ -905,8 +1273,6 @@ namespace ams::ldr { confTable = reinterpret_cast(ptr - offsetof(perf_conf_entry, emc_freq_1)); break; } - - ptr += sizeof(u32); } if (!confTable) diff --git a/Source/Atmosphere/stratosphere/loader/source/ldr_oc_suite.hpp b/Source/Atmosphere/stratosphere/loader/source/ldr_oc_suite.hpp new file mode 100644 index 00000000..657cdfd8 --- /dev/null +++ b/Source/Atmosphere/stratosphere/loader/source/ldr_oc_suite.hpp @@ -0,0 +1,101 @@ +/* + * Copyright (C) Switch-OC-Suite + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see . 
+ */ + +#define CUST_REV 1 + +namespace ams::ldr::oc { + #include "mtc_timing_table.hpp" + + enum MtcConfig { + AUTO_ADJ_MARIKO_SAFE = 0, + AUTO_ADJ_MARIKO_4266 = 1, + ENTIRE_TABLE_ERISTA = 2, + ENTIRE_TABLE_MARIKO = 3, + CUSTOMIZED_MARIKO = 4, + }; + + typedef struct { + u8 cust[4] = {'C', 'U', 'S', 'T'}; + u16 custRev = CUST_REV; + u16 mtcConf = AUTO_ADJ_MARIKO_SAFE; + u32 marikoCpuMaxClock; + u32 marikoCpuMaxVolt; + u32 marikoGpuMaxClock; + u32 eristaCpuOCEnable; + u32 eristaCpuMaxVolt; + u32 eristaEmcVolt; + u32 commonEmcMaxClock; + union { + EristaMtcTable eristaMtc; + MarikoMtcTable marikoMtc; + MarikoCustomizedTable marikoTiming; + }; + } CustomizeTable; + + enum { + DO_NOT_OVERRIDE = 0, + OVERRIDE_WITH_ZERO = UINT32_MAX, + }; + + inline void PatchOffset(u32* offset, u32 value) { *(offset) = value; } + + inline Result ResultFailure() { return -1; } + + namespace pcv { + typedef struct { + s32 c0 = 0; + s32 c1 = 0; + s32 c2 = 0; + s32 c3 = 0; + s32 c4 = 0; + s32 c5 = 0; + } cvb_coefficients; + + typedef struct { + u64 freq; + cvb_coefficients cvb_dfll_param; + cvb_coefficients cvb_pll_param; // only c0 is reserved + } cpu_freq_cvb_table_t; + + typedef struct { + u64 freq; + cvb_coefficients cvb_dfll_param; // empty, dfll clock source not selected + cvb_coefficients cvb_pll_param; + } gpu_cvb_pll_table_t; + + typedef struct { + u64 freq; + s32 volt[4] = {0}; + } emc_dvb_dvfs_table_t; + + void Patch(uintptr_t mapped_nso, size_t nso_size); + } + + namespace ptm { + typedef struct { + u32 conf_id; + u32 cpu_freq_1; + u32 cpu_freq_2; + u32 gpu_freq_1; + u32 gpu_freq_2; + u32 emc_freq_1; + u32 emc_freq_2; + u32 padding; + } perf_conf_entry; + + void Patch(uintptr_t mapped_nso, size_t nso_size); + } +} \ No newline at end of file diff --git a/Source/Atmosphere/stratosphere/loader/source/ldr_oc_type.hpp b/Source/Atmosphere/stratosphere/loader/source/ldr_oc_type.hpp deleted file mode 100644 index 64158285..00000000 --- 
a/Source/Atmosphere/stratosphere/loader/source/ldr_oc_type.hpp +++ /dev/null @@ -1,71 +0,0 @@ -#include "mtc_timing_table.hpp" - -constexpr u32 CpuClkOSLimit = 1785'000; -constexpr u32 CpuClkOfficial = 1963'500; -constexpr u32 CpuVoltOfficial = 1120; -constexpr u32 GpuClkOfficial = 1267'200; -constexpr u32 MemClkOSLimit = 1600'000; -constexpr u32 MemClkOSAlt = 1331'200; -constexpr u32 MemClkOSClampDn = 1065'600; - -#define INVALID_MTC_TABLE UINT32_MAX - -typedef struct { - u8 magic[4] = {'C', 'U', 'S', 'T'}; - u32 cpuMaxClock = CpuClkOfficial; - u32 cpuMaxVolt = CpuVoltOfficial; - u32 gpuMaxClock = GpuClkOfficial; - u32 emcMaxClock = 1862'400; - MarikoMtcTable mtcTable = { INVALID_MTC_TABLE }; -} CustomizeTable; - -inline void PatchOffset(uintptr_t offset, u32 value) { - *(reinterpret_cast(offset)) = value; -} - -inline void PatchOffset(u32* offset, u32 value) { - *(offset) = value; -} - -#define ResultFailure() -1 - -namespace pcv { - typedef struct { - s32 c0 = 0; - s32 c1 = 0; - s32 c2 = 0; - s32 c3 = 0; - s32 c4 = 0; - s32 c5 = 0; - } cvb_coefficients; - - typedef struct { - u64 freq; - cvb_coefficients cvb_dfll_param; - cvb_coefficients cvb_pll_param; // only c0 is reserved - } cpu_freq_cvb_table_t; - - typedef struct { - u64 freq; - cvb_coefficients cvb_dfll_param; // empty, dfll clock source not selected - cvb_coefficients cvb_pll_param; - } gpu_cvb_pll_table_t; - - typedef struct { - u64 freq; - s32 volt[4] = {0}; - } emc_dvb_dvfs_table_t; -} - -namespace ptm { - typedef struct { - u32 conf_id; - u32 cpu_freq_1; - u32 cpu_freq_2; - u32 gpu_freq_1; - u32 gpu_freq_2; - u32 emc_freq_1; - u32 emc_freq_2; - u32 padding; - } perf_conf_entry; -} \ No newline at end of file diff --git a/Source/Atmosphere/stratosphere/loader/source/ldr_process_creation.cpp b/Source/Atmosphere/stratosphere/loader/source/ldr_process_creation.cpp index 1c60f8c9..4a5bed22 100644 --- a/Source/Atmosphere/stratosphere/loader/source/ldr_process_creation.cpp +++ 
b/Source/Atmosphere/stratosphere/loader/source/ldr_process_creation.cpp @@ -23,7 +23,7 @@ #include "ldr_patcher.hpp" #include "ldr_process_creation.hpp" #include "ldr_ro_manager.hpp" -#include "ldr_oc_patch.hpp" +#include "ldr_oc_suite.hpp" namespace ams::ldr { @@ -608,10 +608,10 @@ namespace ams::ldr { /* Apply pcv and ptm patches. */ if (g_is_pcv) { - pcv::ApplyAutoPcvPatch(map_address, nso_size); + oc::pcv::Patch(map_address, nso_size); } if (g_is_ptm) { - ptm::ApplyAutoPtmPatch(map_address, nso_size); + oc::ptm::Patch(map_address, nso_size); } } diff --git a/Source/Atmosphere/stratosphere/loader/source/mtc_timing_table.hpp b/Source/Atmosphere/stratosphere/loader/source/mtc_timing_table.hpp index a96413d1..4bef8b05 100644 --- a/Source/Atmosphere/stratosphere/loader/source/mtc_timing_table.hpp +++ b/Source/Atmosphere/stratosphere/loader/source/mtc_timing_table.hpp @@ -16,6 +16,162 @@ * from GCC preprocessor output */ +struct MarikoCustomizedTable { + struct { + uint32_t emc_rc; + uint32_t emc_rfc; + uint32_t emc_rfcpb; + uint32_t emc_ras; + uint32_t emc_rp; + uint32_t emc_r2w; + uint32_t emc_w2r; + uint32_t emc_r2p; + uint32_t emc_w2p; + uint32_t emc_trtm; + uint32_t emc_twtm; + uint32_t emc_tratm; + uint32_t emc_twatm; + uint32_t emc_rd_rcd; + uint32_t emc_wr_rcd; + uint32_t emc_rrd; + uint32_t emc_wdv; + uint32_t emc_wsv; + uint32_t emc_wev; + uint32_t emc_wdv_mask; + uint32_t emc_quse; + uint32_t emc_quse_width; + uint32_t emc_ibdly; + uint32_t emc_obdly; + uint32_t emc_einput; + uint32_t emc_einput_duration; + uint32_t emc_qrst; + uint32_t emc_qsafe; + uint32_t emc_rdv; + uint32_t emc_rdv_mask; + uint32_t emc_rdv_early; + uint32_t emc_rdv_early_mask; + uint32_t emc_refresh; + uint32_t emc_pre_refresh_req_cnt; + uint32_t emc_pdex2wr; + uint32_t emc_pdex2rd; + uint32_t emc_act2pden; + uint32_t emc_rw2pden; + uint32_t emc_cke2pden; + uint32_t emc_pdex2mrr; + uint32_t emc_txsr; + uint32_t emc_txsrdll; + uint32_t emc_tcke; + uint32_t emc_tckesr; + uint32_t 
emc_tpd; + uint32_t emc_tfaw; + uint32_t emc_trpab; + uint32_t emc_tclkstop; + uint32_t emc_trefbw; + uint32_t emc_pmacro_ob_ddll_long_dq_rank1_4; + uint32_t emc_pmacro_ob_ddll_long_dq_rank1_5; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank0_0; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank0_1; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank0_3; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank0_4; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank0_5; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank1_0; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank1_1; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank1_3; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank1_4; + uint32_t emc_pmacro_ob_ddll_long_dqs_rank1_5; + uint32_t emc_pmacro_ddll_long_cmd_0; + uint32_t emc_pmacro_ddll_long_cmd_1; + uint32_t emc_pmacro_ddll_long_cmd_2; + uint32_t emc_pmacro_ddll_long_cmd_3; + uint32_t emc_pmacro_ddll_long_cmd_4; + uint32_t emc_zcal_wait_cnt; + uint32_t emc_mrs_wait_cnt; + uint32_t emc_mrs_wait_cnt2; + uint32_t emc_auto_cal_channel; + uint32_t emc_pmacro_dll_cfg_2; + uint32_t emc_pmacro_autocal_cfg_common; + uint32_t emc_dyn_self_ref_control; + uint32_t emc_qpop; + uint32_t emc_pmacro_cmd_pad_tx_ctrl; + uint32_t emc_tr_timing_0; + uint32_t emc_tr_rdv; + uint32_t emc_tr_qpop; + uint32_t emc_tr_rdv_mask; + uint32_t emc_tr_qsafe; + uint32_t emc_tr_qrst; + uint32_t emc_training_vref_settle; + } common; + + struct { + uint32_t emc_pmacro_ob_ddll_long_dq_rank0_0; + uint32_t emc_pmacro_ob_ddll_long_dq_rank0_1; + uint32_t emc_pmacro_ob_ddll_long_dq_rank0_2; + uint32_t emc_pmacro_ob_ddll_long_dq_rank0_3; + uint32_t emc_pmacro_ob_ddll_long_dq_rank0_4; + uint32_t emc_pmacro_ob_ddll_long_dq_rank0_5; + uint32_t emc_pmacro_ob_ddll_long_dq_rank1_0; + uint32_t emc_pmacro_ob_ddll_long_dq_rank1_1; + uint32_t emc_pmacro_ob_ddll_long_dq_rank1_2; + uint32_t emc_pmacro_ob_ddll_long_dq_rank1_3; + } trim_regs; + + struct { + uint32_t rl; + } dram_timings; + + struct { + uint32_t mc_emem_arb_cfg; + uint32_t mc_emem_arb_timing_rcd; + uint32_t 
mc_emem_arb_timing_rp; + uint32_t mc_emem_arb_timing_rc; + uint32_t mc_emem_arb_timing_ras; + uint32_t mc_emem_arb_timing_faw; + uint32_t mc_emem_arb_timing_wap2pre; + uint32_t mc_emem_arb_timing_r2w; + uint32_t mc_emem_arb_timing_w2r; + uint32_t mc_emem_arb_timing_rfcpb; + uint32_t mc_emem_arb_da_turns; + uint32_t mc_emem_arb_da_covers; + uint32_t mc_emem_arb_misc0; + } burst_mc_regs; + + struct { + uint32_t mc_mll_mpcorer_ptsa_rate; + uint32_t mc_ptsa_grant_decrement; + uint32_t mc_latency_allowance_xusb_0; + uint32_t mc_latency_allowance_xusb_1; + uint32_t mc_latency_allowance_tsec_0; + uint32_t mc_latency_allowance_sdmmca_0; + uint32_t mc_latency_allowance_sdmmcaa_0; + uint32_t mc_latency_allowance_sdmmc_0; + uint32_t mc_latency_allowance_sdmmcab_0; + uint32_t mc_latency_allowance_ppcs_1; + uint32_t mc_latency_allowance_mpcore_0; + uint32_t mc_latency_allowance_hc_0; + uint32_t mc_latency_allowance_hc_1; + uint32_t mc_latency_allowance_avpc_0; + uint32_t mc_latency_allowance_gpu_0; + uint32_t mc_latency_allowance_gpu2_0; + uint32_t mc_latency_allowance_nvenc_0; + uint32_t mc_latency_allowance_nvdec_0; + uint32_t mc_latency_allowance_vic_0; + uint32_t mc_latency_allowance_vi2_0; + uint32_t mc_latency_allowance_isp2_1; + } la_scale_regs; + + uint32_t pllm_ss_ctrl1; + uint32_t pllm_ss_ctrl2; + uint32_t pllmb_ss_ctrl1; + uint32_t pllmb_ss_ctrl2; + uint32_t pllmb_divm; + uint32_t pllmb_divn; + uint32_t min_mrs_wait; + uint32_t emc_mrw; + uint32_t emc_mrw2; + uint32_t emc_cfg_2; + uint32_t latency; +}; + struct MarikoMtcTable { uint32_t rev; char dvfs_ver[60]; @@ -2370,4 +2526,4 @@ struct EristaMtcTable { uint32_t latency; }; -static_assert(sizeof(EristaMtcTable) == 0x1340); \ No newline at end of file +static_assert(sizeof(EristaMtcTable) == 0x1340);
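
Reviewer note (illustrative only, not part of the patch): the `Patch` loops in this diff walk the mapped NSO one 32-bit word at a time, overwrite each word that matches a known marker, and abort when a hit count is not what was expected. Below is a minimal standalone sketch of that scan-and-count pattern. `PatchAllWords` and the byte-vector image are hypothetical stand-ins introduced here; the real handlers also inspect neighbouring words (for example `mtc_min_volt`) before deciding what to patch, which this sketch leaves out.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not from the patch: replace every 32-bit word equal to
// `marker` with `replacement` and report how many words were rewritten.
// Words are read and written through memcpy, so the sketch does not depend on
// the alignment or aliasing of the underlying byte buffer.
std::size_t PatchAllWords(std::vector<std::uint8_t>& image,
                          std::uint32_t marker,
                          std::uint32_t replacement) {
    std::size_t hits = 0;

    // Step by sizeof(u32), as the loops in the diff do, and stop while a full
    // word still fits inside the buffer.
    for (std::size_t off = 0;
         off + sizeof(std::uint32_t) <= image.size();
         off += sizeof(std::uint32_t)) {
        std::uint32_t value;
        std::memcpy(&value, image.data() + off, sizeof(value));
        if (value != marker) {
            continue;
        }
        std::memcpy(image.data() + off, &replacement, sizeof(replacement));
        ++hits;
    }
    return hits;
}
```

A caller would treat an unexpected count as fatal, in the same spirit as the `!cnt[MEM_CLOCK]` check that leads to `AMS_ABORT()` in the diff.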