Wednesday, March 17, 2010

Error moving the Zenoss database to a new MySQL server

We recently grew our Zenoss 2.3.3 installation by installing a dedicated MySQL server and moving the events database to that server. Events were being collected and written, but we could not move any events to history, and the zenactions log showed the following:
2010-03-17 05:31:26 ERROR zen.Events: (1449, "The user specified as a definer ('zenoss'@'localhost') does not exist")
It turns out this error was coming from MySQL. There is a trigger, status_delete, on the events.status table that controls moving events to history; apparently, the application code is designed to simply delete the status table rows when instructed, and the trigger then moves the rows to the history table. The trigger has a definer specified that indicates the zenoss user must be @locahost (yes, that bold code is commented out):

CREATE
    /*!50017 DEFINER = 'zenoss'@'localhost' */
    TRIGGER `status_delete` BEFORE DELETE ON `status`
    FOR EACH ROW INSERT INTO history SET
            dedupid=OLD.dedupid,
            evid=OLD.evid,
            device=OLD.device,
            component=OLD.component,
            eventClass=OLD.eventClass,
            eventKey=OLD.eventKey,
            summary=OLD.summary,
            message=OLD.message,
            severity=OLD.severity,
            eventState=OLD.eventState,
            eventClassKey=OLD.eventClassKey,
            eventGroup=OLD.eventGroup,
            stateChange=OLD.stateChange,
            firstTime=OLD.firstTime,
            lastTime=OLD.lastTime,
            COUNT=OLD.count,
            prodState=OLD.prodState,
            suppid=OLD.suppid,
            manager=OLD.manager,
            agent=OLD.agent,
            DeviceCLass=OLD.DeviceClass,
            Location=OLD.Location,
            Systems=OLD.Systems,
            DeviceGroups=OLD.DeviceGroups,
            ipAddress=OLD.ipAddress,
            facility=OLD.facility,
            priority=OLD.priority,
            ntevid=OLD.ntevid,
            ownerid=OLD.ownerid,
            deletedTime=NULL,
            clearid=OLD.clearid,
            DevicePriority=OLD.DevicePriority,
            eventClassMapping=OLD.eventClassMapping,
            monitor=OLD.monitor;
$$

DELIMITER ;

I fixed it by changing it to the following, and then dropping and recreating the trigger:

CREATE
    TRIGGER `status_delete` BEFORE DELETE ON `status`
    FOR EACH ROW INSERT INTO history SET
            dedupid=OLD.dedupid,
            evid=OLD.evid,
            device=OLD.device,
            component=OLD.component,
            eventClass=OLD.eventClass,
            eventKey=OLD.eventKey,
            summary=OLD.summary,
            message=OLD.message,
            severity=OLD.severity,
            eventState=OLD.eventState,
            eventClassKey=OLD.eventClassKey,
            eventGroup=OLD.eventGroup,
            stateChange=OLD.stateChange,
            firstTime=OLD.firstTime,
            lastTime=OLD.lastTime,
            COUNT=OLD.count,
            prodState=OLD.prodState,
            suppid=OLD.suppid,
            manager=OLD.manager,
            agent=OLD.agent,
            DeviceCLass=OLD.DeviceClass,
            Location=OLD.Location,
            Systems=OLD.Systems,
            DeviceGroups=OLD.DeviceGroups,
            ipAddress=OLD.ipAddress,
            facility=OLD.facility,
            priority=OLD.priority,
            ntevid=OLD.ntevid,
            ownerid=OLD.ownerid,
            deletedTime=NULL,
            clearid=OLD.clearid,
            DevicePriority=OLD.DevicePriority,
            eventClassMapping=OLD.eventClassMapping,
            monitor=OLD.monitor;
$$

DELIMITER ;

I did try just dropping and recreating the trigger without the change, but it seems the "commented out" definer was responsible for our error, as I duplicated the same error when trying to delete events from the mysql client directly as user zenoss. I then removed the comment completely and recreated the trigger, and the “move to history” function began working.

Note that the age_events stored procedure also has a localhost definer. If you script it out and change the definer specification to the IP of the Zenoss server, that too will function.






Monday, March 08, 2010

Deleting all devices from Zenoss with zendmd

We upgraded our Zenoss 2.3.3 installation to 2.5.2 in a test environment, but wanted to retain only a few of the hundreds of devices that were monitored by the system. We tried deleting these from the web UI, but it would not complete--it always timed out and never seemed to delete anything. I then turned to a zendmd script, which worked wonders:
for dev in dmd.Devices.getSubDevices():
  print dev.id
  dev.deleteDevice()
commit()
reindex()
commit()
This ran quickly and took out all of our devices. We then added back in the ones we wanted. Note: this does not delete performance graphs or events for the affected devices, so you can add them back in and still have this information available.