The Facility Engineer maintains and optimizes the data center's physical infrastructure, including power, cooling and HVAC, backup systems, and building management. The engineer ensures that the data center operates efficiently, meets reliability standards, and complies with safety and regulatory requirements.
Roles and Responsibilities:
1. Facility Operations & Maintenance
- Monitor and maintain critical infrastructure systems, including HVAC (heating, ventilation, and air conditioning), UPS (uninterruptible power supply), backup generators, fire suppression systems, security systems, and electrical systems.
- Conduct regular checks and preventive maintenance on equipment to ensure all systems are functioning properly and to minimize the risk of failure.
- Respond to and resolve operational issues related to building systems (e.g., power outages, cooling system failures, or water leaks).
- Coordinate with vendors for repairs, inspections, and parts replacements as needed.
2. System Performance Monitoring
- Continuously monitor performance data for electrical, mechanical, and HVAC systems to ensure they are operating efficiently and within safe parameters.
- Use building management systems (BMS) or monitoring software to track environmental conditions (temperature, humidity, airflow) and equipment status.
- Ensure real-time alerts and notifications are raised for critical failures, anomalies, and deviations from acceptable operating conditions.
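As an illustration of the monitoring and alerting described above, the following is a minimal sketch of a threshold check on environmental readings. The metric names, the limit values, and the `check_reading` and `check_all` helpers are hypothetical; actual limits come from site policy and equipment specifications, and a real BMS would supply the readings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limit:
    low: float
    high: float


# Hypothetical limits for illustration only; use site-specific values.
LIMITS = {
    "temperature_c": Limit(18.0, 27.0),
    "humidity_pct": Limit(20.0, 80.0),
}


def check_reading(metric, value, limits=LIMITS):
    """Return an alert message if the value is out of range, else None."""
    limit = limits.get(metric)
    if limit is None:
        return f"UNKNOWN metric {metric!r}: no limit configured"
    if value < limit.low:
        return f"ALERT {metric}={value} below {limit.low}"
    if value > limit.high:
        return f"ALERT {metric}={value} above {limit.high}"
    return None


def check_all(readings, limits=LIMITS):
    """Check a dict of metric -> value and collect any alert messages."""
    alerts = []
    for metric, value in readings.items():
        message = check_reading(metric, value, limits)
        if message is not None:
            alerts.append(message)
    return alerts
```

Calling `check_all` on a snapshot of readings returns an empty list when everything is within range, which makes it straightforward to trigger a notification only when the list is non-empty.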
3. Preventive and Corrective Maintenance
- Develop and execute preventive maintenance (PM) schedules for all critical systems, including electrical, mechanical, and HVAC systems, to minimize unplanned downtime.
- Perform corrective maintenance when equipment fails, including troubleshooting, diagnostics, and repair.
- Ensure maintenance work complies with industry standards and local regulations for safety, efficiency, and reliability.
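A preventive maintenance schedule can be tracked with very little logic. The sketch below assumes each task is a simple tuple of name, last completion date, and interval in days; the `next_due` and `overdue_tasks` helpers and the task names in the usage note are hypothetical, and real intervals would come from manufacturer recommendations and site policy.

```python
from datetime import date, timedelta


def next_due(last_done: date, interval_days: int) -> date:
    """Date on which a task is next due, given its last completion date."""
    return last_done + timedelta(days=interval_days)


def overdue_tasks(tasks, today: date):
    """Return the names of tasks whose due date is before today, sorted.

    tasks is an iterable of (name, last_done, interval_days) tuples.
    """
    return sorted(
        name
        for name, last_done, interval_days in tasks
        if next_due(last_done, interval_days) < today
    )
```

For example, passing `[("UPS battery test", date(2024, 1, 1), 30)]` with `today=date(2024, 2, 5)` reports that task as overdue, since it fell due on 31 January 2024.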
4. Emergency Response and Incident Management
- Serve as the primary point of contact for facility-related emergencies, including power outages, HVAC failures, and fire alarms.
- Follow emergency response protocols, including coordination with external vendors, emergency services, and internal teams to resolve critical issues and restore service as quickly as possible.
- Document incidents, root causes, and corrective actions taken to prevent future occurrences.
5. Asset Management and Tracking
- Manage inventory of spare parts, tools, and materials required for the maintenance and repair of facility systems.
- Track the lifecycle of critical assets, including purchase, installation, maintenance, and decommissioning of equipment.
- Ensure asset information is recorded and updated in the asset management system for accurate tracking and reporting.
6. Compliance and Safety
- Ensure all systems comply with relevant health, safety, and environmental regulations (e.g., OSHA, EPA) and maintain certification to standards such as ISO 9001, ISO 27001, or other applicable industry standards.
- Maintain the data center facility in a safe condition, implementing safety protocols and procedures for staff and visitors.
- Conduct safety drills, audits, and inspections to ensure compliance with internal safety policies and external regulatory requirements.
- Ensure that all emergency systems (e.g., fire suppression, emergency lighting, exit routes) are fully operational.
7. Energy Efficiency and Sustainability
- Monitor energy usage and explore opportunities to improve energy efficiency and reduce operational costs while maintaining reliability.
- Recommend and implement upgrades or improvements to systems that enhance energy efficiency and reduce the carbon footprint (e.g., optimizing HVAC operations, improving lighting efficiency).
- Assist in achieving sustainability goals and compliance with green building standards such as LEED or BREEAM.
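One commonly used metric for tracking data center energy efficiency is power usage effectiveness (PUE): total facility energy divided by the energy delivered to IT equipment. The role description does not name a specific metric, so the sketch below is an assumption about how energy monitoring might be quantified; the `pue` function name and its inputs are illustrative.

```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power usage effectiveness: total facility energy / IT equipment energy.

    Both values must cover the same measurement period.
    """
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    if total_facility_kwh < it_equipment_kwh:
        raise ValueError("total facility energy cannot be less than IT energy")
    return total_facility_kwh / it_equipment_kwh
```

A value closer to 1.0 means less energy is spent on overhead such as cooling and power distribution, so tracking the value over time shows whether efficiency upgrades are having an effect.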
8. Vendor Management and Coordination
- Work closely with third-party vendors and contractors to schedule maintenance, obtain quotes, and oversee service delivery for various facility services.
- Ensure vendors meet service level agreements (SLAs) and adhere to safety, quality, and compliance standards.
- Coordinate with external parties for major repairs or upgrades to building systems (e.g., electrical, HVAC, fire suppression).
9. Documentation and Reporting
- Maintain detailed records of all maintenance activities, inspections, repairs, and incidents.
- Generate and analyze reports on system performance, energy usage, maintenance activities, and operational metrics for management review.
- Prepare and submit reports on the status of the facility, including energy efficiency, uptime, and planned maintenance activities.
10. Capacity Planning and Infrastructure Design
- Collaborate with IT and operations teams to plan for future capacity requirements and infrastructure upgrades.
- Assist in the design and implementation of system expansions or new installations to accommodate growth in data center operations (e.g., additional power or cooling capacity).
- Plan for redundancy and scalability in critical systems to ensure high availability and reliability.
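Redundancy planning often starts from a simple sizing rule: enough units to carry the load (N), plus a number of spares. The sketch below assumes identical units and an N+1 default; the `units_required` function and its parameters are hypothetical, and real sizing would also account for derating, growth margins, and distribution constraints.

```python
import math


def units_required(load_kw: float, unit_capacity_kw: float, redundant_units: int = 1) -> int:
    """Number of identical units needed to carry the load, plus spares.

    With redundant_units=1 this corresponds to an N+1 arrangement.
    """
    if load_kw < 0 or unit_capacity_kw <= 0 or redundant_units < 0:
        raise ValueError("load must be >= 0, capacity > 0, spares >= 0")
    base_units = math.ceil(load_kw / unit_capacity_kw)
    return base_units + redundant_units
```

For a 450 kW load served by 200 kW units, three units are needed to carry the load, so an N+1 design calls for four.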