Cache Simulator

In this project, I built a cache simulator and implemented a transpose function to explore memory systems. It pushed me to think critically about caching, memory access patterns, and optimization. Here's a detailed breakdown of my solutions, including the logic and reasoning behind each implementation.

1. csim.c: The Cache Simulator

Purpose:

This file implements the core logic for simulating cache behavior. It reads a memory trace file and performs operations like tracking hits, misses, and evictions based on the specified cache parameters.

Key Code: Reading the Trace File

void readTraceFile(struct cache *cache, char *contents) {
  char operation;
  unsigned long long int address;
  int size;
  char garbage;

  if (sscanf(contents, "%c%c %llx,%d", &garbage, &operation, &address, &size) == 4) {
    switch (garbage) {
      case 'I': // Instruction loads don't affect the cache
        break;
      default:
        switch (operation) {
          case 'M': // Modify requires two cache accesses
            addCacheAccess(cache, address);
            addCacheAccess(cache, address);
            break;
          default:
            addCacheAccess(cache, address); // Single access
            break;
        }
        break;
    }
  }
}

Analysis:

  • What’s Happening:
    • The function parses a single line of the memory trace, extracting the operation (L, S, or M) and the memory address.
    • It handles the specific behaviors for each operation:
      • M (Modify) involves two accesses because it's a load followed by a store.
      • L (Load) and S (Store) involve one access each.
    • I (instruction fetch) lines are ignored because the lab simulates only the data cache.
  • Why This Approach Works:
    • sscanf handles the trace's fixed "op address,size" layout in one call, and checking that all four fields matched skips blank or malformed lines.
    • Because each operation's behavior sits in its own case, the function is easy to extend.
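
For reference, here is a self-contained sketch of how that format string behaves on a typical trace line. The sample line follows the valgrind-style layout the lab's traces use: instruction fetches start in column one, data accesses are indented by one space.

#include <stdio.h>

int main(void) {
  char garbage, operation;
  unsigned long long int address;
  int size;

  // Data accesses are indented, so the first %c captures the leading space
  // (or the 'I' of an instruction fetch) and the second %c the operation.
  const char *line = " M 0421c7f0,4";
  if (sscanf(line, "%c%c %llx,%d", &garbage, &operation, &address, &size) == 4)
    printf("op=%c address=%llx size=%d\n", operation, address, size);
  return 0;
}

This prints op=M address=421c7f0 size=4, exactly the fields readTraceFile dispatches on.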

Key Code: Main Function

int main(int argc, char **argv) {
  int c, blockOffsetBits = 0, setBits = 0, associativity = 0;
  struct cache *cache = NULL;
  FILE *file = NULL;
  char *str = (char *)malloc(MAX_LENGTH * sizeof(char));
  while ((c = getopt(argc, argv, "hvs:E:b:t:")) != -1) {
    switch (c) {
      case 's': setBits = atoi(optarg); break;
      case 'E': associativity = atoi(optarg); break;
      case 'b': blockOffsetBits = atoi(optarg); break;
      case 't':
        file = fopen(optarg, "r");
        if (!file) {
          perror("Error opening file");
          return -1;
        }
        break;
      default: break;
    }
  }

  if (!file) { // -t was missing or the file failed to open
    fprintf(stderr, "Usage: %s -s <s> -E <E> -b <b> -t <tracefile>\n", argv[0]);
    return -1;
  }

  cache = initializeCache(setBits, associativity, blockOffsetBits);

  while (fgets(str, MAX_LENGTH, file) != NULL) {
    readTraceFile(cache, str);
  }

  fclose(file);
  free(str);

  printSummary(cache->hits, cache->misses, cache->evictions);
  freeCache(cache);
  return 0;
}

Analysis:

  • What’s Happening:
    • This is the entry point for the cache simulator. It parses command-line arguments to set cache parameters (s, E, b) and opens the memory trace file.
    • The cache is initialized with the given parameters, and each line of the trace file is processed to simulate cache behavior.
    • Finally, it prints a summary of hits, misses, and evictions, then frees all allocated memory.
  • Why This Approach Works:
    • Parsing arguments with getopt ensures flexibility in handling different configurations.
    • Processing the trace file line-by-line minimizes memory usage.
    • Clear separation of tasks (e.g., initialization, trace processing, and summary) keeps the code organized.
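
The struct definitions and initializeCache aren't shown above, so here is a minimal sketch consistent with the members the code touches (cache->hits, s->blocks[i].valid, and so on); the actual layout in my header may differ slightly.

#include <stdlib.h>

struct block {
  int valid;                  // has this line been filled yet?
  unsigned long long int tag; // tag bits of the cached address
  int lru;                    // age counter: 0 = most recently used
};

struct set {
  int associativity;          // E: lines per set
  struct block *blocks;
};

struct cache {
  int setBits, associativity, blockOffsetBits; // s, E, b
  int hits, misses, evictions;
  struct set *sets;           // 2^s sets
};

struct cache *initializeCache(int setBits, int associativity, int blockOffsetBits) {
  struct cache *cache = malloc(sizeof(*cache));
  cache->setBits = setBits;
  cache->associativity = associativity;
  cache->blockOffsetBits = blockOffsetBits;
  cache->hits = cache->misses = cache->evictions = 0;

  int numSets = 1 << setBits; // S = 2^s sets
  cache->sets = malloc(numSets * sizeof(struct set));
  for (int i = 0; i < numSets; i++) {
    cache->sets[i].associativity = associativity;
    // calloc zeroes valid and lru, so every line starts empty
    cache->sets[i].blocks = calloc(associativity, sizeof(struct block));
  }
  return cache;
}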

2. set.c: Managing Cache Sets

Purpose:

This file manages individual cache sets, including their blocks and metadata like LRU (Least Recently Used) values.

Key Code: Adding Access to a Set

int addSetAccess(struct set *s, unsigned long long int tag) {
  for (int i = 0; i < s->associativity; i++) {
    if (s->blocks[i].valid && s->blocks[i].tag == tag) {
      changeLRU(s, s->blocks[i].lru); // Update LRU
      s->blocks[i].lru = 0;
      return 1; // Hit
    }
  }

  for (int i = 0; i < s->associativity; i++) {
    if (!s->blocks[i].valid) {
      // Blocks fill in order, so the i valid blocks all have lru < i;
      // passing i as the age threshold bumps every one of them.
      changeLRU(s, i);
      s->blocks[i].valid = 1;
      s->blocks[i].tag = tag;
      s->blocks[i].lru = 0;
      return 0; // Miss
    }
  }

  // Set is full: age every line, then overwrite the least recently used one.
  int evictionIndex = highestLRU(s);
  changeLRU(s, s->associativity); // every valid lru < associativity, so all age
  s->blocks[evictionIndex].tag = tag;
  s->blocks[evictionIndex].lru = 0;
  return -1; // Eviction
}

Analysis:

  • What’s Happening:
    • The function checks if the requested tag exists in the set (a hit).
    • If not, it tries to place the tag in an empty block (a miss).
    • If the set is full, it evicts the block with the highest LRU value and replaces it with the new tag (an eviction).
  • Why This Approach Works:
    • LRU is the replacement policy the lab specifies; it exploits temporal locality by always evicting the line that has gone unused the longest.
    • The function handles all possible cases (hit, miss, eviction) in a structured way.
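
addSetAccess reports its outcome through the return value; addCacheAccess (not shown above) presumably slices the address into a set index and tag and translates that return value into counter updates. A sketch, using the struct fields assumed in the earlier snippet:

void addCacheAccess(struct cache *cache, unsigned long long int address) {
  // Set index: the s bits just above the block offset; tag: everything above.
  unsigned long long int setIndex =
      (address >> cache->blockOffsetBits) & ((1ULL << cache->setBits) - 1);
  unsigned long long int tag =
      address >> (cache->blockOffsetBits + cache->setBits);

  switch (addSetAccess(&cache->sets[setIndex], tag)) {
    case 1:  cache->hits++;                       break; // tag was resident
    case 0:  cache->misses++;                     break; // filled an empty line
    case -1: cache->misses++; cache->evictions++; break; // replaced the LRU line
  }
}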

Key Code: LRU Management

// Age every valid block that was used more recently (smaller lru value)
// than the block being promoted, which the caller then resets to 0.
void changeLRU(struct set *set, int currentLRU) {
  if (currentLRU == 0) return;
  for (int i = 0; i < set->associativity; i++) {
    if (set->blocks[i].valid && set->blocks[i].lru < currentLRU) {
      set->blocks[i].lru++;
    }
  }
}

Analysis:

  • What’s Happening:
    • This function ages the set on each access: every valid block that was used more recently than the accessed one (i.e., has a smaller lru value) gets its counter incremented, and the caller then resets the accessed block's counter to 0.
  • Why This Approach Works:
    • Because the counters only grow with age, the least recently used block always holds the highest value, so highestLRU can pick the eviction victim with a single scan.
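
A concrete walk-through may help. The snippet below replays the aging logic on a stand-alone 2-way set; the struct and function here are local stand-ins written just for this example, not the simulator's own.

#include <stdio.h>

struct blk { int valid; unsigned long long tag; int lru; };

// Same aging rule as changeLRU: bump every valid block younger than
// (i.e., with a smaller lru value than) the one being promoted.
static void age(struct blk *b, int n, int currentLRU) {
  if (currentLRU == 0) return;
  for (int i = 0; i < n; i++)
    if (b[i].valid && b[i].lru < currentLRU) b[i].lru++;
}

int main(void) {
  struct blk set[2] = {{1, 0xA, 1}, {1, 0xB, 0}}; // B is currently the MRU
  age(set, 2, set[0].lru); // re-access tag A: age everything younger than it
  set[0].lru = 0;          // then promote A to most recently used
  printf("A.lru=%d  B.lru=%d\n", set[0].lru, set[1].lru); // A.lru=0  B.lru=1
  return 0;
}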

3. trans.c: Matrix Transpose

Purpose:

This file implements a baseline transpose function and serves as a foundation for optimizing memory access patterns.

Key Code: Baseline Transpose

void trans(int M, int N, int A[N][M], int B[M][N]) {
  int i, j, tmp;
  for (i = 0; i < N; i++) {
    for (j = 0; j < M; j++) {
      tmp = A[i][j];
      B[j][i] = tmp;
    }
  }
}

Analysis:

  • What’s Happening:
    • The function walks matrix A row by row and writes each element to its transposed position in B.
    • Because B is filled column by column, consecutive writes land in different cache lines, so for large M or N nearly every write to B misses.
  • Why This Approach Works:
    • While not optimized, this function demonstrates the core logic of matrix transposition, making it a good baseline for comparing optimized versions.
  • What Could Be Improved:
    • Blocking (tiling) the loops so that a small sub-matrix of A and B stays cache-resident would significantly reduce misses; a sketch follows below.
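
As one illustration of the idea, here is a generic blocked transpose, not my tuned solution. It tiles the iteration so one BLOCK x BLOCK tile of A and B stays cache-resident while it is copied; BLOCK = 8 assumes 32-byte lines holding eight 4-byte ints and should be tuned to the target cache.

#define BLOCK 8

void trans_blocked(int M, int N, int A[N][M], int B[M][N]) {
  for (int ii = 0; ii < N; ii += BLOCK) {
    for (int jj = 0; jj < M; jj += BLOCK) {
      // Copy one tile; the extra bounds checks handle ragged edges.
      for (int i = ii; i < ii + BLOCK && i < N; i++) {
        for (int j = jj; j < jj + BLOCK && j < M; j++) {
          B[j][i] = A[i][j];
        }
      }
    }
  }
}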

Final Thoughts:

The Cache Lab was a rewarding exploration into how memory systems operate. From building a cache simulator to optimizing matrix transpositions, I deepened my understanding of caching and efficient data handling. The structured implementation of LRU, trace processing, and optimization techniques reflects my ability to tackle complex, system-level challenges effectively.