Cleanup MongoDB

Description

Any log record in the database with a create date earlier than 2020 may be deleted.

Also review the table(s) for this application to determine whether there is other old data that can be removed.

In addition to the CNDC database, there is an old database

    LAMPCMS

that Catherine identified as obsolete. This database will also be removed under this Jira issue.

Environment

None

Attachments

2

Details

Time tracking

4h logged

Created June 9, 2023 at 11:57 AM
Updated June 29, 2023 at 1:21 PM

Activity

Wendy Root, June 13, 2023 at 1:02 PM

Code - Delete, Reindex

BootStrap.groovy

MaintenanceJob.groovy

load.gsp (from a privileged account only)

  • Has a checkbox that lets the user choose to reindex the existing master.

LoadContext.groovy

  • load.gsp updates this context with the reindex value. The LoadContext is then used by LoadJob.groovy and DirectCertLoadService.groovy.

LoadJob.groovy

Teresa Williams, June 12, 2023 at 1:32 PM

The reindex you mentioned on StudentMaster is part of the data import executed each month.

Wendy Root, June 9, 2023 at 12:35 PM (edited)

CNDC MongoDB Tables

  • DatabaseInfo

    • Has 1 record

  • Processinglog

    • Has 103,084 records

      There are 69,155 records with a date < '01/01/2020'

  • StudentDistrict

    • Has 0 records

  • StudentMaster

    • Has 602,959 records

  • StudentResult

    • Has 0 records. This domain had a reference to StudentDistrict, which also has 0 records.
      In researching the application, I found that these are temporary tables/collections. A maintenance job cleans (deleteAll) these tables nightly. They are populated when a user does an interactive run to get a report of their qualifying students.

  • User

    • 3 Records for Admin accounts

Temporary Collections

There are several 'tmp.mr' collections listed in the CNDC DB.


I found that these are temporary 'map reduce' tables. They are supposed to be deleted automatically when the connection that created them is closed.

Found a script that someone had posted on Stack Overflow to remove these dead/unwanted tmp collections in an automated manner.

For this issue, we can just do a normal db.collectionName.drop() and see if they come back on a regular basis.

These tmp tables were dropped on 6/9/2023.
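
A minimal sketch of the manual drop described above, written as mongo shell JavaScript. It is not the Stack Overflow script. It assumes the temporary collections all have names beginning with 'tmp.mr'; the function name dropTempMapReduce is hypothetical, and db is expected to be the shell's handle for the CNDC database.

```javascript
// Drop every collection whose name starts with "tmp.mr".
// Returns the list of names that were dropped so the run can be recorded.
// Assumption: `db` is a mongo shell database handle (e.g. after `use <database>`).
function dropTempMapReduce(db) {
  var dropped = [];
  db.getCollectionNames().forEach(function (name) {
    if (name.indexOf("tmp.mr") === 0) {
      db.getCollection(name).drop();
      dropped.push(name);
    }
  });
  return dropped;
}
```

Calling dropTempMapReduce(db) from the shell and saving the returned names gives a record to compare against if the collections come back.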

Reindex

Reindexed StudentMaster on 6/9/2023

Removed Log Entries prior to 1/1/2020

Completed 6/9/2023

After removing these entries, there are now 33,929 records in the Processinglog collection.
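
A sketch of how this removal could be run from the mongo shell. The field name dateCreated is an assumption (the actual date field in Processinglog is not recorded in this issue), as is the assumption that it is stored as a date rather than a string; purgeOldLogs is a hypothetical helper name.

```javascript
// Remove Processinglog entries older than `cutoff` and report the counts.
// Assumptions: the date field is named `dateCreated` and holds a Date value;
// `db` is a mongo shell database handle for the CNDC database.
function purgeOldLogs(db, cutoff) {
  var logs = db.getCollection("Processinglog");
  var query = { dateCreated: { $lt: cutoff } };
  var matched = logs.count(query);   // how many entries fall before the cutoff
  logs.remove(query);                // delete only the matching entries
  return { matched: matched, remaining: logs.count() };
}
```

For example, purgeOldLogs(db, new Date(2020, 0, 1)) would target entries before 1/1/2020. The counts reported in this comment are consistent with each other: 103,084 - 69,155 = 33,929.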

Reindexed after the removals. The only index is on id, so it is doubtful this had any impact.

Compact or Repair

To try to reclaim space in the database, we will need to use the compact command. Before doing so, we need to ensure we have a good backup of the database.

https://www.mongodb.com/docs/v2.6/reference/command/compact/#dbcmd.compact

Another option is the repairDatabase command.
https://www.mongodb.com/docs/v2.6/reference/command/repairDatabase/

Either option will lock up the application while it runs, preventing users from doing anything.
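
A minimal sketch of issuing the compact command from the mongo shell, one collection at a time. The helper name compactCollections is hypothetical; it should only be run after the backup has been verified and during a window when users are not expected.

```javascript
// Run the `compact` command against each named collection and
// report whether the server answered ok for each one.
// Assumption: `db` is a mongo shell database handle for the target database.
function compactCollections(db, names) {
  return names.map(function (name) {
    var res = db.runCommand({ compact: name });
    return { collection: name, ok: res.ok === 1 };
  });
}
```

The collection names are passed in as an array, so the same helper can be pointed at whichever collections turn out to hold the most reserved space.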

The StudentMaster collection is using about 2.6GB of space.
Processinglog is using 23.6MB of space.

The two 'temporary collections' are reserving a lot of space. Even though there are no records in these tables at this time, the space they have used is still reserved.
StudentDistrict is using 28.99GB
StudentResult is using 3.3GB
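
A sketch of how collections that are empty but still hold reserved space could be listed from the mongo shell, using the count and storageSize fields returned by the collection stats() helper. The function name emptyButReserved is hypothetical.

```javascript
// List collections that hold no documents but still have storage allocated.
// Assumption: `db` is a mongo shell database handle; sizes are reported in bytes.
function emptyButReserved(db) {
  var report = [];
  db.getCollectionNames().forEach(function (name) {
    var s = db.getCollection(name).stats();
    if (s.count === 0 && s.storageSize > 0) {
      report.push({ collection: name, storageBytes: s.storageSize });
    }
  });
  return report;
}
```

With the figures above, the two empty collections together reserve 28.99 GB + 3.3 GB = 32.29 GB, compared with about 2.6 GB for StudentMaster, which holds all 602,959 records.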
