Jan 3, 2026 · 16 mins read

hiding strings in java memory

how to prevent strings from appearing in runtime memory and bytecode

what’s the point

imagine you’re building a cheat for a game. you want to hide strings like ur “aimassist” from the evil invasive screensharer of a minehut server who follows a stiff guide on how to screenshare and larps about really knowing how to do digital forensics

this technique makes it harder to find those strings. the string never exists as a complete, searchable block in memory. it’s split into fragments that get stitched together, then thrown back out in the trash

understanding the constant pool

what is it

when you compile java code to a .class file, the compiler creates a constant pool - a table of all literal values (strings, numbers, method references, etc.) used in the program. this is part of the class file format itself, not the executable instructions.

when you write:

System.out.println("hello world");

the string "hello world" is stored as an entry in the constant pool. the actual bytecode instruction is just ldc #3 (load constant #3), which means “go look up entry #3 from the constant pool at runtime and push it onto the stack.”

the ldc instruction

ldc stands for “load constant”. when the jvm executes bytecode, it encounters this instruction and looks up the string from the constant pool. the important thing: this happens at runtime, after the class is already loaded.

what actually happens:

  1. class file has CONSTANT_Utf8 entries (the raw bytes "He", "rr", etc.)
  2. ldc resolves a CONSTANT_String entry that references one of those utf8 entries
  3. the jvm creates a String object from that constant and interns it (stores it in the string pool for the lifetime of the classloader)
  4. your fragment is now a string object sitting in memory

the key: each fragment is loaded and interned separately. the complete string is never loaded as a single constant.
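a minimal sketch of the interning behavior from the steps above. the class and method names (InternDemo, literal) are made up for illustration, and it only relies on the language guarantee that string literals are interned:

```java
class InternDemo {
    // Returns the literal; ldc resolves the constant pool entry and interns it.
    static String literal() {
        return "He";
    }

    public static void main(String[] args) {
        String a = literal();
        String b = "He";                 // same constant, so the same interned object
        String copy = new String("He");  // a separate heap object with equal contents

        System.out.println(a == b);             // true
        System.out.println(a == copy);          // false
        System.out.println(a == copy.intern()); // true: intern() returns the pooled object
    }
}
```

every fragment you pass to String.join() behaves like a and b here: one shared, long-lived object per distinct literal.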

bytecode example

here’s the actual bytecode from the demonstration code. look at what happens when you compile:

System.out.println(String.join("", "He", "rr", "o", " ", "an", "d", " ", "we", "lc", "om", "e", "!"));

the bytecode is:

getstatic java/lang/System.out Ljava/io/PrintStream;
ldc ""                          // load empty string delimiter
bipush 12                        // push the array length (12)
anewarray java/lang/CharSequence // create the CharSequence[12] for the varargs
dup 
iconst_0 
ldc "He"                        // load fragment 1
aastore 
dup 
iconst_1 
ldc "rr"                        // load fragment 2
aastore 
dup 
iconst_2 
ldc "o"                         // load fragment 3
aastore 
dup 
iconst_3 
ldc " "                         // load fragment 4 (space)
aastore 
... (continues for all 12 array elements)
invokestatic java/lang/String.join (Ljava/lang/CharSequence;[Ljava/lang/CharSequence;)Ljava/lang/String;
invokevirtual java/io/PrintStream.println (Ljava/lang/String;)V

notice: each ldc loads a single fragment from the constant pool. there’s no ldc "Herro and welcome!" instruction. the complete string never appears in bytecode.
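the anewarray / aastore sequence is there because a varargs call is just sugar for building an array. a small sketch to make that visible (VarargsDemo and its method names are hypothetical, and the bytecode hints in the comments are approximate):

```java
class VarargsDemo {
    // What you write: javac builds the CharSequence[] for you.
    static String sugared() {
        return String.join("", "He", "rr", "o");
    }

    // Roughly what javac emits: allocate the array, then store each fragment.
    static String explicit() {
        CharSequence[] parts = new CharSequence[3]; // anewarray
        parts[0] = "He";                            // ldc + aastore
        parts[1] = "rr";                            // ldc + aastore
        parts[2] = "o";                             // ldc + aastore
        return String.join("", parts);
    }

    public static void main(String[] args) {
        System.out.println(sugared());  // Herro
        System.out.println(explicit()); // Herro
    }
}
```

either way, the only string constants involved are the fragments themselves.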

when you dump this bytecode with a tool like javap or recaf, you see exactly this - individual ldc instructions for each fragment. the constant pool has string constants backed by utf8 entries:

#N = String backed by Utf8 "He"
#N+1 = String backed by Utf8 "rr"
#N+2 = String backed by Utf8 "o"
#N+3 = String backed by Utf8 " "
#N+4 = String backed by Utf8 "an"
#N+5 = String backed by Utf8 "d"
... etc

but never a CONSTANT_String entry for the complete message.
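if you’d rather check this yourself than trust a tool, here’s a rough sketch of a constant pool reader. ConstantPoolDump is a name i made up, it only walks the constant pool (nothing after it), and the tag sizes are taken from the class file format as i understand it:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ConstantPoolDump {
    /** Returns every CONSTANT_Utf8 value found in the constant pool of a class file. */
    static List<String> utf8Entries(byte[] classBytes) {
        List<String> out = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();             // minor version
            in.readUnsignedShort();             // major version
            int count = in.readUnsignedShort(); // constant_pool_count (entries are 1..count-1)
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1:  // Utf8: u2 length + modified UTF-8, which readUTF() understands
                        out.add(in.readUTF());
                        break;
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.skipBytes(4);        // Integer, Float, *ref, NameAndType, (Invoke)Dynamic
                        break;
                    case 5: case 6:
                        in.skipBytes(8);        // Long and Double take two pool slots
                        i++;
                        break;
                    case 7: case 8: case 16: case 19: case 20:
                        in.skipBytes(2);        // Class, String, MethodType, Module, Package
                        break;
                    case 15:
                        in.skipBytes(3);        // MethodHandle
                        break;
                    default:
                        throw new IllegalArgumentException("unknown constant pool tag " + tag);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ConstantPoolDump <path to .class file>");
            return;
        }
        for (String entry : utf8Entries(Files.readAllBytes(Paths.get(args[0])))) {
            System.out.println(entry);
        }
    }
}
```

run it against the compiled Main.class and the output will include the fragments next to class names, method names and descriptors (those are Utf8 entries too), but not the full message.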

demonstration

here’s the actual code that demonstrates this:

package me.awrped;

import java.util.concurrent.locks.LockSupport;

public class Main {
    public static void main(String[] args) {
        System.out.println(String.join("", "He", "rr", "o", " ", "an", "d", " ", "we", "lc", "om", "e", "!"));
        for (int i = 1; i <= 5; i++) {
            System.out.println(String.join("", "i", " ", "=", " ") + i);
        }

        while (!Thread.currentThread().isInterrupted()) {
            LockSupport.parkNanos(1_000_000_000L);
        }
    }
}

output:

Herro and welcome!
i = 1
i = 2
i = 3
i = 4
i = 5

now here’s the verification part that matters. compile and run this code. then:

  1. open process hacker
  2. find javaw.exe (your java process)
  3. tools → search memory
  4. search for the string “Herro and welcome!”
  5. you won’t find it

try searching for just “Herro” - still nothing. try “welcome” - nothing. but try searching for “He” or “rr” or “om” - those fragments ARE in memory.

this is the whole point: the complete string never exists in memory long enough to scan for it.
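to be clear about what “searching memory” means here: a scanner is basically looking for a contiguous byte pattern. a toy sketch of that idea, assuming the scanner tries both a one-byte-per-char encoding and utf-16 (since jdk 9, hotspot stores latin-1-only strings as one byte per char). MemoryScanSketch and the fake region are made up, this is not how process hacker is implemented:

```java
import java.nio.charset.StandardCharsets;

class MemoryScanSketch {
    // Naive search for a byte pattern inside a region; returns the offset or -1.
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // True if the text appears contiguously, as Latin-1 or as UTF-16LE.
    static boolean contains(byte[] region, String text) {
        return indexOf(region, text.getBytes(StandardCharsets.ISO_8859_1)) >= 0
            || indexOf(region, text.getBytes(StandardCharsets.UTF_16LE)) >= 0;
    }

    public static void main(String[] args) {
        // Fake "memory": the fragments exist, but with unrelated bytes between them.
        byte[] region = {'H', 'e', 0, 0, 'r', 'r', 0, 0, 'o'};
        System.out.println(contains(region, "He"));    // true
        System.out.println(contains(region, "Herro")); // false
    }
}
```

fragments that live in separate objects never line up into one contiguous match, which is why the full-string search comes back empty.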

why this works

compile time behavior

when you write a normal string:

System.out.println("Herro and welcome!");

javac stores the complete string in the constant pool of the .class file. anyone can find it with strings, javap, decompilers, hex editors, whatever.

but when you use String.join() and fragment it:

String.join("", "He", "rr", "o", " ", "an", "d", " ", "we", "lc", "om", "e", "!")

each distinct fragment goes into the constant pool individually (the space is used twice but only stored once). the relevant constant pool entries look roughly like:

#1 = Utf8    He
#2 = Utf8    rr
#3 = Utf8    o
#4 = Utf8     
#5 = Utf8    an
#6 = Utf8    d
#7 = Utf8    we
#8 = Utf8    lc
#9 = Utf8    om
#10 = Utf8   e
#11 = Utf8   !

the full message never exists in bytecode. String.join() is a method call that happens at runtime, so javac can’t optimize it away like it does with string concatenation ("He" + "rr" + "o" gets folded into "Herro" at compile time).

runtime behavior

here’s what actually happens in memory when the code runs:

  1. jvm loads the class: the constant pool is read into memory. each fragment ("He", "rr", "o", " ", "an", "d", etc.) is its own entry, and turns into its own interned String the first time an ldc resolves it. no complete string yet.

  2. String.join() is called: your code executes String.join("", "He", "rr", "o", " ", ...). the jvm doesn’t have a magic function that assembles strings invisibly - it calls the actual String.join() method in the jdk.

let’s look at the actual jdk source to understand what happens internally.

String.join() implementation

String.join() is in src/java.base/share/classes/java/lang/String.java (lines 3802-3834):

public static String join(CharSequence delimiter, CharSequence... elements) {
    var delim = delimiter.toString();
    var elems = new String[elements.length];
    for (int i = 0; i < elements.length; i++) {
        elems[i] = String.valueOf(elements[i]);
    }
    return join("", "", delim, elems, elems.length);
}

it converts the fragments to strings and delegates to an internal join method. this internal method (lines 3836-3884) does the actual concatenation:

@ForceInline
static String join(String prefix, String suffix, String delimiter, String[] elements, int size) {
    int icoder = prefix.coder() | suffix.coder();
    long len = (long) prefix.length() + suffix.length();
    if (size > 1) {
        len += (long) (size - 1) * delimiter.length();
        icoder |= delimiter.coder();
    }
    for (int i = 0; i < size; i++) {
        var el = elements[i];
        len += el.length();
        icoder |= el.coder();
    }
    byte coder = (byte) icoder;
    if (len < 0L || (len <<= coder) != (int) len) {
        throw new OutOfMemoryError("Requested string length exceeds VM limit");
    }
    byte[] value = StringConcatHelper.newArray((int) len);
    
    int off = 0;
    prefix.getBytes(value, off, coder); off += prefix.length();
    if (size > 0) {
        var el = elements[0];
        el.getBytes(value, off, coder); off += el.length();
        for (int i = 1; i < size; i++) {
            delimiter.getBytes(value, off, coder); off += delimiter.length();
            el = elements[i];
            el.getBytes(value, off, coder); off += el.length();
        }
    }
    suffix.getBytes(value, off, coder);
    
    return new String(value, coder);
}

it allocates a byte array of the exact size needed, copies all the fragments into it byte-by-byte, and creates a new string from that array. this all happens in local method scope. the key is what happens next:
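stripped of the coder handling, the copy boils down to something like this. it’s a simplified sketch that assumes latin-1-only fragments and an empty delimiter, not the jdk’s code, and joinLatin1 is a name i made up:

```java
import java.nio.charset.StandardCharsets;

class JoinSketch {
    // Simplified stand-in for the internal join: Latin-1 input only, no delimiter.
    static String joinLatin1(String... fragments) {
        int len = 0;
        for (String f : fragments) {
            len += f.length();
        }
        byte[] value = new byte[len]; // one array sized for the whole message

        int off = 0;
        for (String f : fragments) {
            byte[] src = f.getBytes(StandardCharsets.ISO_8859_1);
            System.arraycopy(src, 0, value, off, src.length);
            off += src.length;
        }
        // From here on, `value` holds the complete message as one contiguous block.
        // (This constructor copies the array; the real join wraps it directly.)
        return new String(value, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(joinLatin1("He", "rr", "o", " ", "an", "d", " ", "we", "lc", "om", "e", "!"));
        // prints: Herro and welcome!
    }
}
```

the comment before the return marks the exact moment the full string becomes one contiguous run of bytes.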

System.out.println(String.join(...));

the string is created, passed to println(), written to the output stream, and then the reference is gone. there’s no variable holding it. it becomes eligible for garbage collection immediately.

  3. String.join() allocates memory: inside the join method, a byte[] is allocated large enough to hold all fragments combined. the fragments are copied byte-by-byte into this array. this is the start of the window in which the complete string "Herro and welcome!" exists in memory, inside this byte array.

  4. new String is created: a new String object wraps this byte array and is returned.

  5. string is printed: the string is immediately passed to System.out.println(), which writes it to stdout. no local variable ever holds the reference; it only lives on the operand stack for the duration of the call.

  6. garbage collection: the temporary byte array and String object are now unreferenced and become eligible for garbage collection. depending on when GC runs, the string may exist in memory briefly before being reclaimed.
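you can watch the “eligible for garbage collection” part with a WeakReference. this is only a sketch: ShortLivedDemo is a made-up name, and System.gc() is a request rather than a command, so the result can differ between jvms and runs:

```java
import java.lang.ref.WeakReference;

class ShortLivedDemo {
    // Builds the string and keeps only a weak reference to it.
    static WeakReference<String> joinedWeakly() {
        return new WeakReference<>(String.join("", "He", "rr", "o"));
    }

    public static void main(String[] args) throws InterruptedException {
        WeakReference<String> ref = joinedWeakly();

        // Ask for a collection a few times; stop early once the referent is gone.
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();
            Thread.sleep(50);
        }

        String survivor = ref.get();
        System.out.println(survivor == null ? "collected" : "still reachable: " + survivor);
    }
}
```

a cleared reference means nothing points at the object anymore; the literal fragments, by contrast, stay reachable through the class that uses them.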

jit optimizations

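there’s a second way the string can stay out of memory: once a method is hot enough to be compiled by c2, escape analysis can mark an allocation as non-escaping, and hotspot’s PhaseMacroExpand::scalar_replacement() in macro.cpp is what actually eliminates it. here’s how it works: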

stage 1: check if allocation can be eliminated

the can_eliminate_allocation() function scans all uses of the allocated object:

// Check the possibility of scalar replacement.
bool PhaseMacroExpand::can_eliminate_allocation(PhaseIterGVN* igvn, AllocateNode *alloc, GrowableArray <SafePointNode *>* safepoints) {
  //  Scan the uses of the allocation to check for anything that would
  //  prevent us from eliminating it.
  NOT_PRODUCT( const char* fail_eliminate = nullptr; )
  DEBUG_ONLY( Node* disq_node = nullptr; )
  bool can_eliminate = true;
  bool reduce_merge_precheck = (safepoints == nullptr);

  Node* res = alloc->result_cast();
  const TypeOopPtr* res_type = nullptr;
  if (res == nullptr) {
    // All users were eliminated.
  } else if (!res->is_CheckCastPP()) {
    NOT_PRODUCT(fail_eliminate = "Allocation does not have unique CheckCastPP";)
    can_eliminate = false;
  } else {
    res_type = igvn->type(res)->isa_oopptr();
    if (res_type == nullptr) {
      NOT_PRODUCT(fail_eliminate = "Neither instance or array allocation";)
      can_eliminate = false;
    } else if (!res_type->klass_is_exact()) {
      NOT_PRODUCT(fail_eliminate = "Not an exact type.";)
      can_eliminate = false;
    } else if (res_type->isa_aryptr()) {
      int length = alloc->in(AllocateNode::ALength)->find_int_con(-1);
      if (length < 0) {
        NOT_PRODUCT(fail_eliminate = "Array's size is not constant";)
        can_eliminate = false;
      }
    }
  }

  if (can_eliminate && res != nullptr) {
    BarrierSetC2 *bs = BarrierSet::barrier_set()->barrier_set_c2();
    for (DUIterator_Fast jmax, j = res->fast_outs(jmax);
                               j < jmax && can_eliminate; j++) {
      Node* use = res->fast_out(j);

      if (use->is_AddP()) {
        const TypePtr* addp_type = igvn->type(use)->is_ptr();
        int offset = addp_type->offset();

        if (offset == Type::OffsetTop || offset == Type::OffsetBot) {
          NOT_PRODUCT(fail_eliminate = "Undefined field reference";)
          can_eliminate = false;
          break;
        }
        for (DUIterator_Fast kmax, k = use->fast_outs(kmax);
                                   k < kmax && can_eliminate; k++) {
          Node* n = use->fast_out(k);
          if (n->is_Mem() && n->as_Mem()->is_mismatched_access()) {
            DEBUG_ONLY(disq_node = n);
            NOT_PRODUCT(fail_eliminate = "Mismatched access");
            can_eliminate = false;
          }
          if (!n->is_Store() && n->Opcode() != Op_CastP2X && !bs->is_gc_pre_barrier_node(n) && !reduce_merge_precheck) {
            DEBUG_ONLY(disq_node = n;)
            if (n->is_Load() || n->is_LoadStore()) {
              NOT_PRODUCT(fail_eliminate = "Field load";)
            } else {
              NOT_PRODUCT(fail_eliminate = "Not store field reference";)
            }
            can_eliminate = false;
          }
        }
      } else if (use->is_ArrayCopy() &&
                 (use->as_ArrayCopy()->is_clonebasic() ||
                  use->as_ArrayCopy()->is_arraycopy_validated() ||
                  use->as_ArrayCopy()->is_copyof_validated() ||
                  use->as_ArrayCopy()->is_copyofrange_validated()) &&
                 use->in(ArrayCopyNode::Dest) == res) {
        // ok to eliminate
      } else if (use->is_SafePoint()) {
        SafePointNode* sfpt = use->as_SafePoint();
        if (sfpt->is_Call() && sfpt->as_Call()->has_non_debug_use(res)) {
          // Object is passed as argument.
          DEBUG_ONLY(disq_node = use;)
          NOT_PRODUCT(fail_eliminate = "Object is passed as argument";)
          can_eliminate = false;
        }
        Node* sfptMem = sfpt->memory();
        if (sfptMem == nullptr || sfptMem->is_top()) {
          DEBUG_ONLY(disq_node = use;)
          NOT_PRODUCT(fail_eliminate = "null or TOP memory";)
          can_eliminate = false;
        } else if (!reduce_merge_precheck) {
          safepoints->append_if_missing(sfpt);
        }
      } else if (reduce_merge_precheck &&
                 (use->is_Phi() || use->is_EncodeP() ||
                  use->Opcode() == Op_MemBarRelease ||
                  (UseStoreStoreForCtor && use->Opcode() == Op_MemBarStoreStore))) {
        // Nothing to do
      } else if (use->Opcode() != Op_CastP2X) { // CastP2X is used by card mark
        if (use->is_Phi()) {
          if (use->outcnt() == 1 && use->unique_out()->Opcode() == Op_Return) {
            NOT_PRODUCT(fail_eliminate = "Object is return value";)
          } else {
            NOT_PRODUCT(fail_eliminate = "Object is referenced by Phi";)
          }
          DEBUG_ONLY(disq_node = use;)
        } else {
          if (use->Opcode() == Op_Return) {
            NOT_PRODUCT(fail_eliminate = "Object is return value";)
          } else {
            NOT_PRODUCT(fail_eliminate = "Object is referenced by node";)
          }
          DEBUG_ONLY(disq_node = use;)
        }
        can_eliminate = false;
      }
    }
  }

#ifndef PRODUCT
  if (PrintEliminateAllocations && safepoints != nullptr) {
    if (can_eliminate) {
      tty->print("Scalar ");
      if (res == nullptr)
        alloc->dump();
      else
        res->dump();
    } else if (alloc->_is_scalar_replaceable) {
      tty->print("NotScalar (%s)", fail_eliminate);
      if (res == nullptr)
        alloc->dump();
      else
        res->dump();
#ifdef ASSERT
      if (disq_node != nullptr) {
          tty->print("  >>>> ");
          disq_node->dump();
      }
#endif /*ASSERT*/
    }
  }

  if (TraceReduceAllocationMerges && !can_eliminate && reduce_merge_precheck) {
    tty->print_cr("\tCan't eliminate allocation because '%s': ", fail_eliminate != nullptr ? fail_eliminate : "");
    DEBUG_ONLY(if (disq_node != nullptr) disq_node->dump();)
  }
#endif
  return can_eliminate;
}

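for this to apply to String.join(), c2 has to inline the whole call chain and prove that the byte array and the resulting String never escape. note the “Object is passed as argument” check above: a string handed to a call that isn’t inlined (println(), for example) still has to be allocated for real. so this elimination is a best case for hot code, not a guarantee.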
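a tiny sketch of the difference between an allocation that can be eliminated and one that can’t. EscapeDemo, Pair and sink are hypothetical names, and whether c2 really removes the first allocation depends on the jvm, its flags and inlining decisions:

```java
class EscapeDemo {
    static final class Pair {
        final int a;
        final int b;

        Pair(int a, int b) {
            this.a = a;
            this.b = b;
        }
    }

    static Pair sink; // anything stored here is visible outside the method

    // The Pair never leaves this method: a candidate for scalar replacement.
    static int nonEscaping(int x) {
        Pair p = new Pair(x, x + 1);
        return p.a + p.b;
    }

    // The Pair is published through a static field: it escapes and must be allocated.
    static int escaping(int x) {
        Pair p = new Pair(x, x + 1);
        sink = p;
        return p.a + p.b;
    }

    public static void main(String[] args) {
        long sum = 0;
        // Run long enough for the JIT to consider both methods hot.
        for (int i = 0; i < 1_000_000; i++) {
            sum += nonEscaping(i) + escaping(i);
        }
        System.out.println(sum);
    }
}
```

running it once normally and once with -XX:-DoEscapeAnalysis while watching allocation rates is one way to compare. the PrintEliminateAllocations output in the source above sits behind #ifndef PRODUCT, so it’s only available in non-product jvm builds.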

stage 2: describe object fields at safepoints

if the allocation can be eliminated, create_scalarized_object_description() reconstructs object field values for any deoptimization points:

SafePointScalarObjectNode* PhaseMacroExpand::create_scalarized_object_description(AllocateNode *alloc, SafePointNode* sfpt) {
  // Fields of scalar objs are referenced only at the end
  // of regular debuginfo at the last (youngest) JVMS.
  // Record relative start index.
  ciInstanceKlass* iklass    = nullptr;
  BasicType basic_elem_type  = T_ILLEGAL;
  const Type* field_type     = nullptr;
  const TypeOopPtr* res_type = nullptr;
  int nfields                = 0;
  int array_base             = 0;
  int element_size           = 0;
  uint first_ind             = (sfpt->req() - sfpt->jvms()->scloff());
  Node* res                  = alloc->result_cast();

  assert(res == nullptr || res->is_CheckCastPP(), "unexpected AllocateNode result");
  assert(sfpt->jvms() != nullptr, "missed JVMS");

  if (res != nullptr) { // Could be null when there are no users
    res_type = _igvn.type(res)->isa_oopptr();

    if (res_type->isa_instptr()) {
      // find the fields of the class which will be needed for safepoint debug information
      iklass = res_type->is_instptr()->instance_klass();
      nfields = iklass->nof_nonstatic_fields();
    } else {
      // find the array's elements which will be needed for safepoint debug information
      nfields = alloc->in(AllocateNode::ALength)->find_int_con(-1);
      assert(nfields >= 0, "must be an array klass.");
      basic_elem_type = res_type->is_aryptr()->elem()->array_element_basic_type();
      array_base = arrayOopDesc::base_offset_in_bytes(basic_elem_type);
      element_size = type2aelembytes(basic_elem_type);
      field_type = res_type->is_aryptr()->elem();
    }
  }

  SafePointScalarObjectNode* sobj = new SafePointScalarObjectNode(res_type, alloc, first_ind, sfpt->jvms()->depth(), nfields);
  sobj->init_req(0, C->root());
  transform_later(sobj);

  // Scan object's fields adding an input to the safepoint for each field.
  for (int j = 0; j < nfields; j++) {
    intptr_t offset;
    ciField* field = nullptr;
    if (iklass != nullptr) {
      field = iklass->nonstatic_field_at(j);
      offset = field->offset_in_bytes();
      ciType* elem_type = field->type();
      basic_elem_type = field->layout_type();

      // The next code is taken from Parse::do_get_xxx().
      if (is_reference_type(basic_elem_type)) {
        if (!elem_type->is_loaded()) {
          field_type = TypeInstPtr::BOTTOM;
        } else if (field != nullptr && field->is_static_constant()) {
          ciObject* con = field->constant_value().as_object();
          // Do not "join" in the previous type; it doesn't add value,
          // and may yield a vacuous result if the field is of interface type.
          field_type = TypeOopPtr::make_from_constant(con)->isa_oopptr();
          assert(field_type != nullptr, "field singleton type must be consistent");
        } else {
          field_type = TypeOopPtr::make_from_klass(elem_type->as_klass());
        }
        if (UseCompressedOops) {
          field_type = field_type->make_narrowoop();
          basic_elem_type = T_NARROWOOP;
        }
      } else {
        field_type = Type::get_const_basic_type(basic_elem_type);
      }
    } else {
      offset = array_base + j * (intptr_t)element_size;
    }

    const TypeOopPtr *field_addr_type = res_type->add_offset(offset)->isa_oopptr();

    Node *field_val = value_from_mem(sfpt->memory(), sfpt->control(), basic_elem_type, field_type, field_addr_type, alloc);

    // We weren't able to find a value for this field,
    // give up on eliminating this allocation.
    if (field_val == nullptr) {
      uint last = sfpt->req() - 1;
      for (int k = 0;  k < j; k++) {
        sfpt->del_req(last--);
      }
      _igvn._worklist.push(sfpt);

#ifndef PRODUCT
      if (PrintEliminateAllocations) {
        if (field != nullptr) {
          tty->print("=== At SafePoint node %d can't find value of field: ", sfpt->_idx);
          field->print();
          int field_idx = C->get_alias_index(field_addr_type);
          tty->print(" (alias_idx=%d)", field_idx);
        } else { // Array's element
          tty->print("=== At SafePoint node %d can't find value of array element [%d]", sfpt->_idx, j);
        }
        tty->print(", which prevents elimination of: ");
        if (res == nullptr)
          alloc->dump();
        else
          res->dump();
      }
#endif

      return nullptr;
    }

    if (UseCompressedOops && field_type->isa_narrowoop()) {
      // Enable "DecodeN(EncodeP(Allocate)) --> Allocate" transformation
      // to be able scalar replace the allocation.
      if (field_val->is_EncodeP()) {
        field_val = field_val->in(1);
      } else {
        field_val = transform_later(new DecodeNNode(field_val, field_val->get_ptr_type()));
      }
    }
    DEBUG_ONLY(verify_type_compatability(field_val->bottom_type(), field_type);)
    sfpt->add_req(field_val);
  }

  sfpt->jvms()->set_endoff(sfpt->req());

  return sobj;
}

this records how to rebuild the object’s contents (here the bytes 'H', 'e', 'r', 'r', 'o') from the memory chain, so the jvm can materialize the array if it ever has to deoptimize, without allocating it up front.

stage 3: eliminate the allocation

scalar_replacement() replaces references to the allocated object with the scalarized description:

// Do scalar replacement.
bool PhaseMacroExpand::scalar_replacement(AllocateNode *alloc, GrowableArray <SafePointNode *>& safepoints) {
  GrowableArray <SafePointNode *> safepoints_done;
  Node* res = alloc->result_cast();
  assert(res == nullptr || res->is_CheckCastPP(), "unexpected AllocateNode result");

  // Process the safepoint uses
  while (safepoints.length() > 0) {
    SafePointNode* sfpt = safepoints.pop();
    SafePointScalarObjectNode* sobj = create_scalarized_object_description(alloc, sfpt);

    if (sobj == nullptr) {
      undo_previous_scalarizations(safepoints_done, alloc);
      return false;
    }

    // Now make a pass over the debug information replacing any references
    // to the allocated object with "sobj"
    JVMState *jvms = sfpt->jvms();
    sfpt->replace_edges_in_range(res, sobj, jvms->debug_start(), jvms->debug_end(), &_igvn);
    _igvn._worklist.push(sfpt);

    // keep it for rollback
    safepoints_done.append_if_missing(sfpt);
  }

  return true;
}

when this succeeds, the byte array that String.join() allocates is never created. the individual bytes ('H', 'e', 'r', 'r', 'o') stay in registers/stack. the temporary String object wrapping them is eliminated. a memory scan looking for “Herro” finds nothing because it never exists in memory as a coherent block.

the timing window

here’s the crucial part: when someone opens process hacker and searches for “Herro and welcome!”, they’re looking for that exact byte sequence in memory right now.

the string is created inside the String.join() method (local scope). it gets returned, printed to the console, and then there’s no reference to it anymore. it becomes eligible for garbage collection.

if someone dumps memory while your code is sitting idle (which is most of the time), the complete string likely isn’t there - either it was already garbage collected, or if JIT optimizations kicked in, it was never allocated in the first place. only the fragments in the constant pool remain. and fragments alone don’t give away your secret - a scanner needs context to work out what they concatenate into.

compare this to normal strings - those stay in the string pool for as long as the class is loaded, and a memory dump at any point shows the complete message right there.

comparison: why string concatenation fails

if you tried to hide a string with normal concatenation:

String msg = "He" + "rr" + "o" + " " + "an" + "d" + " " + "we" + "lc" + "om" + "e" + "!";

java’s compiler is smart enough to see this is all compile-time constants. it optimizes it at compile time to:

String msg = "Herro and welcome!";

now the complete string sits in the bytecode forever. anyone running strings Main.class finds it immediately. and it’s in memory as a constant pool entry.

with String.join("", "str1", "str2", ...), the compiler can’t pre-compute it. it’s a method call that has to happen at runtime.
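you can see the folding from inside java with a reference comparison. a small sketch (FoldingDemo and its method names are made up; note that this demo deliberately puts the full "Herro" literal in its own constant pool, so it’s for understanding, not for hiding anything):

```java
class FoldingDemo {
    // A constant expression: javac folds it into the single constant "Herro".
    static String folded() {
        return "He" + "rr" + "o";
    }

    // A runtime call: the result is a fresh String object every time.
    static String joined() {
        return String.join("", "He", "rr", "o");
    }

    public static void main(String[] args) {
        String literal = "Herro";

        System.out.println(folded() == literal);      // true: same interned constant
        System.out.println(joined() == literal);      // false: built at runtime
        System.out.println(joined().equals(literal)); // true: same characters
    }
}
```

the first comparison being true is the problem in one line: the concatenated version is the literal.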

conclusion

this technique keeps the full string out of the bytecode and makes it hard to catch in runtime memory, and now u know how to hide strings for ur questionably coded p2c. when you search javaw.exe process memory, the full string usually isn’t there: the constant pool only has fragments, and the assembled string is short-lived garbage that’s typically collected before a scanner gets to it. thank you for reading <3 this took me a REALLY long time to write
