Cache Simulator

In this project, I built a cache simulator and implemented a transpose function to explore memory systems. It pushed me to think critically about caching, memory access patterns, and optimization. Here's a detailed breakdown of my solutions, including the logic and reasoning behind each implementation.

1. csim.c: The Cache Simulator

Purpose:

This file implements the core logic for simulating cache behavior. It reads a memory trace file and performs operations like tracking hits, misses, and evictions based on the specified cache parameters.

Key Code: Reading the Trace File

void readTraceFile(struct cache *cache, char *contents) {
  char operation;
  unsigned long long int address;
  int size;
  char garbage;

  if (sscanf(contents, "%c%c %llx,%d", &garbage, &operation, &address, &size) == 4) {
    switch (garbage) {
      case 'I': // Instruction loads don't affect the cache
        break;
      default:
        switch (operation) {
          case 'M': // Modify requires two cache accesses
            addCacheAccess(cache, address);
            addCacheAccess(cache, address);
            break;
          default:
            addCacheAccess(cache, address); // Single access
            break;
        }
        break;
    }
  }
}

Analysis:

  • What’s Happening:
    • The function parses a single line of the memory trace, extracting the operation (L, S, or M) and the memory address.
    • It handles the specific behaviors for each operation:
      • M (Modify) involves two accesses because it's a load followed by a store.
      • L (Load) and S (Store) involve one access each.
    • I (instruction fetch) lines are ignored because the lab simulates only the data cache.
  • Why This Approach Works:
    • sscanf handles the trace's fixed "op address,size" layout in one call, and checking that all four fields matched skips blank or malformed lines.
    • Because each operation's behavior sits in its own case, the function is easy to extend.
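
For reference, here is a self-contained sketch of how that format string behaves on a typical trace line. The sample line follows the valgrind-style layout the lab's traces use: instruction fetches start in column one, data accesses are indented by one space.

#include <stdio.h>

int main(void) {
  char garbage, operation;
  unsigned long long int address;
  int size;

  // Data accesses are indented, so the first %c captures the leading space
  // (or the 'I' of an instruction fetch) and the second %c the operation.
  const char *line = " M 0421c7f0,4";
  if (sscanf(line, "%c%c %llx,%d", &garbage, &operation, &address, &size) == 4)
    printf("op=%c address=%llx size=%d\n", operation, address, size);
  return 0;
}

This prints op=M address=421c7f0 size=4, exactly the fields readTraceFile dispatches on.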

Key Code: Main Function

int main(int argc, char **argv) {
  int c, blockOffsetBits = 0, setBits = 0, associativity = 0;
  struct cache *cache = NULL;
  FILE *file = NULL;
  char *str = (char *)malloc(MAX_LENGTH * sizeof(char));
  while ((c = getopt(argc, argv, "hvs:E:b:t:")) != -1) {
    switch (c) {
      case 's': setBits = atoi(optarg); break;
      case 'E': associativity = atoi(optarg); break;
      case 'b': blockOffsetBits = atoi(optarg); break;
      case 't':
        file = fopen(optarg, "r");
        if (!file) {
          perror("Error opening file");
          return -1;
        }
        break;
      default: break;
    }
  }

  if (!file) { // -t was missing or the file failed to open
    fprintf(stderr, "Usage: %s -s <s> -E <E> -b <b> -t <tracefile>\n", argv[0]);
    return -1;
  }

  cache = initializeCache(setBits, associativity, blockOffsetBits);

  while (fgets(str, MAX_LENGTH, file) != NULL) {
    readTraceFile(cache, str);
  }

  fclose(file);
  free(str);

  printSummary(cache->hits, cache->misses, cache->evictions);
  freeCache(cache);
  return 0;
}

Analysis:

  • What’s Happening:
    • This is the entry point for the cache simulator. It parses command-line arguments to set cache parameters (s, E, b) and opens the memory trace file.
    • The cache is initialized with the given parameters, and each line of the trace file is processed to simulate cache behavior.
    • Finally, it prints a summary of hits, misses, and evictions, then frees all allocated memory.
  • Why This Approach Works:
    • Parsing arguments with getopt ensures flexibility in handling different configurations.
    • Processing the trace file line-by-line minimizes memory usage.
    • Clear separation of tasks (e.g., initialization, trace processing, and summary) keeps the code organized.
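
The struct definitions and initializeCache aren't shown above, so here is a minimal sketch consistent with the members the code touches (cache->hits, s->blocks[i].valid, and so on); the actual layout in my header may differ slightly.

#include <stdlib.h>

struct block {
  int valid;                  // has this line been filled yet?
  unsigned long long int tag; // tag bits of the cached address
  int lru;                    // age counter: 0 = most recently used
};

struct set {
  int associativity;          // E: lines per set
  struct block *blocks;
};

struct cache {
  int setBits, associativity, blockOffsetBits; // s, E, b
  int hits, misses, evictions;
  struct set *sets;           // 2^s sets
};

struct cache *initializeCache(int setBits, int associativity, int blockOffsetBits) {
  struct cache *cache = malloc(sizeof(*cache));
  cache->setBits = setBits;
  cache->associativity = associativity;
  cache->blockOffsetBits = blockOffsetBits;
  cache->hits = cache->misses = cache->evictions = 0;

  int numSets = 1 << setBits; // S = 2^s sets
  cache->sets = malloc(numSets * sizeof(struct set));
  for (int i = 0; i < numSets; i++) {
    cache->sets[i].associativity = associativity;
    // calloc zeroes valid and lru, so every line starts empty
    cache->sets[i].blocks = calloc(associativity, sizeof(struct block));
  }
  return cache;
}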

2. set.c: Managing Cache Sets

Purpose:

This file manages individual cache sets, including their blocks and metadata like LRU (Least Recently Used) values.

Key Code: Adding Access to a Set

int addSetAccess(struct set *s, unsigned long long int tag) {
  for (int i = 0; i < s->associativity; i++) {
    if (s->blocks[i].valid && s->blocks[i].tag == tag) {
      changeLRU(s, s->blocks[i].lru); // Update LRU
      s->blocks[i].lru = 0;
      return 1; // Hit
    }
  }

  for (int i = 0; i < s->associativity; i++) {
    if (!s->blocks[i].valid) {
      // Blocks fill in order, so the i valid blocks all have lru < i;
      // passing i as the age threshold bumps every one of them.
      changeLRU(s, i);
      s->blocks[i].valid = 1;
      s->blocks[i].tag = tag;
      s->blocks[i].lru = 0;
      return 0; // Miss
    }
  }

  // Set is full: age every line, then overwrite the least recently used one.
  int evictionIndex = highestLRU(s);
  changeLRU(s, s->associativity); // every valid lru < associativity, so all age
  s->blocks[evictionIndex].tag = tag;
  s->blocks[evictionIndex].lru = 0;
  return -1; // Eviction
}

Analysis:

  • What’s Happening:
    • The function checks if the requested tag exists in the set (a hit).
    • If not, it tries to place the tag in an empty block (a miss).
    • If the set is full, it evicts the block with the highest LRU value and replaces it with the new tag (an eviction).
  • Why This Approach Works:
    • LRU is the replacement policy the lab specifies; it exploits temporal locality by always evicting the line that has gone unused the longest.
    • The function handles all possible cases (hit, miss, eviction) in a structured way.
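
addSetAccess reports its outcome through the return value; addCacheAccess (not shown above) presumably slices the address into a set index and tag and translates that return value into counter updates. A sketch, using the struct fields assumed in the earlier snippet:

void addCacheAccess(struct cache *cache, unsigned long long int address) {
  // Set index: the s bits just above the block offset; tag: everything above.
  unsigned long long int setIndex =
      (address >> cache->blockOffsetBits) & ((1ULL << cache->setBits) - 1);
  unsigned long long int tag =
      address >> (cache->blockOffsetBits + cache->setBits);

  switch (addSetAccess(&cache->sets[setIndex], tag)) {
    case 1:  cache->hits++;                       break; // tag was resident
    case 0:  cache->misses++;                     break; // filled an empty line
    case -1: cache->misses++; cache->evictions++; break; // replaced the LRU line
  }
}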

Key Code: LRU Management

// Age every valid block that was used more recently (smaller lru value)
// than the block being promoted, which the caller then resets to 0.
void changeLRU(struct set *set, int currentLRU) {
  if (currentLRU == 0) return;
  for (int i = 0; i < set->associativity; i++) {
    if (set->blocks[i].valid && set->blocks[i].lru < currentLRU) {
      set->blocks[i].lru++;
    }
  }
}

Analysis:

  • What’s Happening:
    • This function ages the set on each access: every valid block that was used more recently than the accessed one (i.e., has a smaller lru value) gets its counter incremented, and the caller then resets the accessed block's counter to 0.
  • Why This Approach Works:
    • Because the counters only grow with age, the least recently used block always holds the highest value, so highestLRU can pick the eviction victim with a single scan.
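
A concrete walk-through may help. The snippet below replays the aging logic on a stand-alone 2-way set; the struct and function here are local stand-ins written just for this example, not the simulator's own.

#include <stdio.h>

struct blk { int valid; unsigned long long tag; int lru; };

// Same aging rule as changeLRU: bump every valid block younger than
// (i.e., with a smaller lru value than) the one being promoted.
static void age(struct blk *b, int n, int currentLRU) {
  if (currentLRU == 0) return;
  for (int i = 0; i < n; i++)
    if (b[i].valid && b[i].lru < currentLRU) b[i].lru++;
}

int main(void) {
  struct blk set[2] = {{1, 0xA, 1}, {1, 0xB, 0}}; // B is currently the MRU
  age(set, 2, set[0].lru); // re-access tag A: age everything younger than it
  set[0].lru = 0;          // then promote A to most recently used
  printf("A.lru=%d  B.lru=%d\n", set[0].lru, set[1].lru); // A.lru=0  B.lru=1
  return 0;
}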

3. trans.c: Matrix Transpose

Purpose:

This file implements a baseline transpose function and serves as a foundation for optimizing memory access patterns.

Key Code: Baseline Transpose

void trans(int M, int N, int A[N][M], int B[M][N]) {
  int i, j, tmp;
  for (i = 0; i < N; i++) {
    for (j = 0; j < M; j++) {
      tmp = A[i][j];
      B[j][i] = tmp;
    }
  }
}

Analysis:

  • What’s Happening:
    • The function walks matrix A row by row and writes each element to its transposed position in B.
    • Because B is filled column by column, consecutive writes land in different cache lines, so for large M or N nearly every write to B misses.
  • Why This Approach Works:
    • While not optimized, this function demonstrates the core logic of matrix transposition, making it a good baseline for comparing optimized versions.
  • What Could Be Improved:
    • Blocking (tiling) the loops so that a small sub-matrix of A and B stays cache-resident would significantly reduce misses; a sketch follows below.
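
As one illustration of the idea, here is a generic blocked transpose, not my tuned solution. It tiles the iteration so one BLOCK x BLOCK tile of A and B stays cache-resident while it is copied; BLOCK = 8 assumes 32-byte lines holding eight 4-byte ints and should be tuned to the target cache.

#define BLOCK 8

void trans_blocked(int M, int N, int A[N][M], int B[M][N]) {
  for (int ii = 0; ii < N; ii += BLOCK) {
    for (int jj = 0; jj < M; jj += BLOCK) {
      // Copy one tile; the extra bounds checks handle ragged edges.
      for (int i = ii; i < ii + BLOCK && i < N; i++) {
        for (int j = jj; j < jj + BLOCK && j < M; j++) {
          B[j][i] = A[i][j];
        }
      }
    }
  }
}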

Final Thoughts:

The Cache Lab was a rewarding exploration into how memory systems operate. From building a cache simulator to optimizing matrix transpositions, I deepened my understanding of caching and efficient data handling. The structured implementation of LRU, trace processing, and optimization techniques reflects my ability to tackle complex, system-level challenges effectively.