Конференция "Базы" » Падение SQL Server и его восстановление
 
  • Сергей Суровцев © (16.06.16 00:55) [0]
    День добрый.
    Ситуация такая:
    В процессе создания и заполнения таблицы (промежуточной и неважной, но объемной) кончилось место на диске. Таблица наполовину заполнилась и тормознулась с ошибкой.
    И все бы ничего, место освободил, но после этого сервер стал жаловаться на tempdb.mdf. При работе уже с другими таблицами. Перезапуск его решал вопрос на небольшое время, но буквально 2-3 запроса и та же картина. Вроде бы tempdb.mdf должен пересоздаваться при перезапуске, но что-то видимо оставалось. А потом сервер вообще перестал стартовать. Видимо полетела таблица master.
    Вопрос - как вернуть сервер в работоспособное состояние, так чтобы сохранить все данные. Там больше 40 баз, бекапов естественно актуальных нет, ни данных, ни master. Если можно, подробно, чем это может грозить и как избежав негатива вернуть
    все на круги своя.
  • iop © (16.06.16 08:35) [1]
    попробовать взять мдф'ы (с журналами и без)
    и сделать им sp_attach_db на переустановленном сервере
  • Сергей Суровцев © (16.06.16 10:12) [2]
    Это крайний вариант.
    Других нет?
  • sniknik © (16.06.16 11:27) [3]
    > Видимо полетела таблица master.
    https://support.microsoft.com/ru-ru/kb/298568
  • Сергей Суровцев © (16.06.16 11:47) [4]
    >sniknik ©   (16.06.16 11:27) [3]

    master база конечно.

    Дело в том, что есть сомнения в ее полете...
    Вот лог.
    там только про tempdb...

    2016-06-16 09:47:32.73 Server      Microsoft SQL Server 2012 - 11.0.2218.0 (X64)
    Jun 12 2012 13:05:25
    Copyright (c) Microsoft Corporation
    Developer Edition (64-bit) on Windows NT 6.2 <X64> (Build 9200: )

    2016-06-16 09:47:32.74 Server      (c) Microsoft Corporation.
    2016-06-16 09:47:32.74 Server      All rights reserved.
    2016-06-16 09:47:32.74 Server      Server process ID is 2972.
    2016-06-16 09:47:32.74 Server      System Manufacturer: 'Gigabyte Technology Co., Ltd.', System Model: 'To be filled by O.E.M.'.
    2016-06-16 09:47:32.74 Server      Authentication mode is WINDOWS-ONLY.
    2016-06-16 09:47:32.74 Server      Logging SQL Server messages in file 'F:\MSSQL\DATA\MSSQL11.MSSQLSERVER\MSSQL\Log\ERRORLOG'.
    2016-06-16 09:47:32.74 Server      The service account is 'STAND\perepelov_as'. This is an informational message; no user action is required.
    2016-06-16 09:47:32.74 Server      Registry startup parameters:
     -d F:\MSSQL\DATA\MSSQL11.MSSQLSERVER\MSSQL\DATA\master.mdf
     -e F:\MSSQL\DATA\MSSQL11.MSSQLSERVER\MSSQL\Log\ERRORLOG
     -l F:\MSSQL\DATA\MSSQL11.MSSQLSERVER\MSSQL\DATA\mastlog.ldf
    2016-06-16 09:47:32.74 Server      Command Line Startup Parameters:
     -s "MSSQLSERVER"
    2016-06-16 09:47:32.98 Server      SQL Server detected 1 sockets with 4 cores per socket and 8 logical processors per socket, 8 total logical processors; using 8 logical processors based on SQL Server licensing. This is an informational message; no user action is required.
    2016-06-16 09:47:32.98 Server      SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.
    2016-06-16 09:47:32.98 Server      Detected 16273 MB of RAM. This is an informational message; no user action is required.
    2016-06-16 09:47:32.98 Server      Using conventional memory in the memory manager.
    2016-06-16 09:47:33.25 Server      This instance of SQL Server last reported using a process ID of 7772 at 15.06.2016 16:52:41 (local) 15.06.2016 13:52:41 (UTC). This is an informational message only; no user action is required.
    2016-06-16 09:47:33.25 Server      Node configuration: node 0: CPU mask: 0x00000000000000ff:0 Active CPU mask: 0x00000000000000ff:0. This message provides a description of the NUMA configuration for this computer. This is an informational message only. No user action is required.
    2016-06-16 09:47:33.25 Server      Using dynamic lock allocation.  Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node.  This is an informational message only.  No user action is required.
    2016-06-16 09:47:33.27 Server      Software Usage Metrics is disabled.
    2016-06-16 09:47:33.28 spid5s      Starting up database 'master'.
    2016-06-16 09:47:33.29 spid5s      41 transactions rolled forward in database 'master' (1:0). This is an informational message only. No user action is required.
    2016-06-16 09:47:33.29 spid5s      0 transactions rolled back in database 'master' (1:0). This is an informational message only. No user action is required.
    2016-06-16 09:47:33.38 Server      CLR version v4.0.30319 loaded.
    2016-06-16 09:47:33.38 spid5s      Resource governor reconfiguration succeeded.
    2016-06-16 09:47:33.38 spid5s      SQL Server Audit is starting the audits. This is an informational message. No user action is required.
    2016-06-16 09:47:33.38 spid5s      SQL Server Audit has started the audits. This is an informational message. No user action is required.
    2016-06-16 09:47:33.41 spid5s      SQL Trace ID 1 was started by login "sa".
    2016-06-16 09:47:33.42 spid5s      Server name is 'PARTNER21'. This is an informational message only. No user action is required.
    2016-06-16 09:47:33.44 Server      Common language runtime (CLR) functionality initialized using CLR version v4.0.30319 from C:\Windows\Microsoft.NET\Framework64\v4.0.30319\.
    2016-06-16 09:47:33.50 spid13s     A self-generated certificate was successfully loaded for encryption.
    2016-06-16 09:47:33.51 spid13s     Server is listening on [ 'any' <ipv6> 1433].
    2016-06-16 09:47:33.51 spid13s     Server is listening on [ 'any' <ipv4> 1433].
    2016-06-16 09:47:33.51 spid13s     Server local connection provider is ready to accept connection on [ \\.\pipe\SQLLocal\MSSQLSERVER ].
    2016-06-16 09:47:33.51 spid13s     Server local connection provider is ready to accept connection on [ \\.\pipe\sql\query ].
    2016-06-16 09:47:33.51 Server      Server is listening on [ ::1 <ipv6> 1434].
    2016-06-16 09:47:33.51 Server      Server is listening on [ 127.0.0.1 <ipv4> 1434].
    2016-06-16 09:47:33.51 Server      Dedicated admin connection support was established for listening locally on port 1434.
    2016-06-16 09:47:33.51 spid13s     SQL Server is now ready for client connections. This is an informational message; no user action is required.
    2016-06-16 09:47:33.51 Server      SQL Server is attempting to register a Service Principal Name (SPN) for the SQL Server service. Kerberos authentication will not be possible until a SPN is registered for the SQL Server service. This is an informational message. No user action is required.
    2016-06-16 09:47:33.51 Server      The SQL Server Network Interface library could not register the Service Principal Name (SPN) [ MSSQLSvc/Partner21.stand.central ] for the SQL Server service. Windows return code: 0xffffffff, state: 43. Failure to register a SPN might cause integrated authentication to use NTLM instead of Kerberos. This is an informational message. Further action is only required if Kerberos authentication is required by authentication policies and if the SPN has not been manually registered.
    2016-06-16 09:47:33.51 Server      The SQL Server Network Interface library could not register the Service Principal Name (SPN) [ MSSQLSvc/Partner21.stand.central:1433 ] for the SQL Server service. Windows return code: 0xffffffff, state: 43. Failure to register a SPN might cause integrated authentication to use NTLM instead of Kerberos. This is an informational message. Further action is only required if Kerberos authentication is required by authentication policies and if the SPN has not been manually registered.
    2016-06-16 09:47:33.64 spid14s     A new instance of the full-text filter daemon host process has been successfully started.
  • Сергей Суровцев © (16.06.16 11:47) [5]
    2016-06-16 09:47:33.66 spid18s     Starting up database 'ABIZ_OLAP'.
    2016-06-16 09:47:33.66 spid20s     Starting up database 'fleet_reg_all_2015_full'.
    2016-06-16 09:47:33.66 spid21s     Starting up database 'fleet_reg_all_2015_UAZ'.
    2016-06-16 09:47:33.66 spid17s     Starting up database 'msdb'.
    2016-06-16 09:47:33.66 spid19s     Starting up database 'abiz_admin'.
    2016-06-16 09:47:33.66 spid24s     Starting up database 'park_all_2014_full'.
    2016-06-16 09:47:33.66 spid26s     Starting up database 'park_fleet_all_2014_full'.
    2016-06-16 09:47:33.66 spid32s     Starting up database 'fleet_reg_all_2014_full'.
    2016-06-16 09:47:33.66 spid22s     Starting up database 'Geo'.
    2016-06-16 09:47:33.66 spid30s     Starting up database 'fleet_reg_all_2013_full'.
    2016-06-16 09:47:33.66 spid9s      Starting up database 'mssqlsystemresource'.
    2016-06-16 09:47:33.66 spid27s     Starting up database 'reg_rbus_dubl'.
    2016-06-16 09:47:33.66 spid29s     Starting up database 'reg_rtrk_dubl'.
    2016-06-16 09:47:33.66 spid35s     Starting up database 'MapsRussia'.
    2016-06-16 09:47:33.66 spid36s     Starting up database 'Park Ssang Yong'.
    2016-06-16 09:47:33.66 spid37s     Starting up database 'park_all_2013_full'.
    2016-06-16 09:47:33.66 spid33s     Starting up database 'Import'.
    2016-06-16 09:47:33.66 spid31s     Starting up database 'fleet_reg_all_2014'.
    2016-06-16 09:47:33.66 spid34s     Starting up database 'lib'.
    2016-06-16 09:47:33.66 spid28s     Starting up database 'Export'.
    2016-06-16 09:47:33.66 spid38s     Starting up database 'park_car_2014_dv4'.
    2016-06-16 09:47:33.66 spid23s     Starting up database 'park_all_2010_full'.
    2016-06-16 09:47:33.66 spid25s     Starting up database 'park_car_2014'.
    2016-06-16 09:47:33.66 spid39s     Starting up database 'park_trlr_2013'.
    2016-06-16 09:47:33.66 spid40s     Starting up database 'reg_all_2011_full'.
    2016-06-16 09:47:33.66 spid41s     Starting up database 'reg_all_2012_full'.
    2016-06-16 09:47:33.66 spid42s     Starting up database 'reg_all_2013_full'.
    2016-06-16 09:47:33.66 spid43s     Starting up database 'reg_all_2014_full'.
    2016-06-16 09:47:33.66 spid44s     Starting up database 'reg_all_2014_full_UAZ'.
    2016-06-16 09:47:33.66 spid45s     Starting up database 'reg_all_2015_full'.
    2016-06-16 09:47:33.66 spid46s     Starting up database 'reg_all_srokvlad'.
    2016-06-16 09:47:33.66 spid47s     Starting up database 'reg_rtrlr_dubl'.
    2016-06-16 09:47:33.66 spid48s     Starting up database 'reg_all_2015_full_UAZ'.
    2016-06-16 09:47:33.67 spid9s      The resource database build version is 11.00.2218. This is an informational message only. No user action is required.
    2016-06-16 09:47:33.70 spid9s      Starting up database 'model'.
    2016-06-16 09:47:33.70 spid20s     Error: 9003, Severity: 20, State: 9.
    2016-06-16 09:47:33.70 spid20s     The log scan number (35520:19:0) passed to log scan in database 'fleet_reg_all_2015_full' is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf). If this error occurred during replication, re-create the publication. Otherwise, restore from backup if the problem results in a failure during startup.
    2016-06-16 09:47:33.71 spid20s     Error: 3414, Severity: 21, State: 1.
    2016-06-16 09:47:33.71 spid20s     An error occurred during recovery, preventing the database 'fleet_reg_all_2015_full' (7:0) from restarting. Diagnose the recovery errors and fix them, or restore from a known good backup. If errors are not corrected or expected, contact Technical Support.
    2016-06-16 09:47:33.71 spid20s     Starting up database 'reg_trlr_stat'.
    2016-06-16 09:47:33.78 spid9s      Clearing tempdb database.
    2016-06-16 09:47:33.81 spid9s      Error: 17053, Severity: 16, State: 1.
    2016-06-16 09:47:33.81 spid9s      F:\MSSQL\DATA\tempdb.mdf: Operating system error 1393(Структура диска повреждена. Чтение невозможно.) encountered.
    2016-06-16 09:47:33.82 spid9s      Error: 5173, Severity: 16, State: 1.
    2016-06-16 09:47:33.82 spid9s      One or more files do not match the primary file of the database. If you are attempting to attach a database, retry the operation with the correct files.  If this is an existing database, the file may be corrupted and should be restored from a backup.
    2016-06-16 09:47:33.82 spid9s      Error: 5173, Severity: 16, State: 1.
    2016-06-16 09:47:33.82 spid9s      One or more files do not match the primary file of the database. If you are attempting to attach a database, retry the operation with the correct files.  If this is an existing database, the file may be corrupted and should be restored from a backup.
    2016-06-16 09:47:33.82 spid9s      Error: 1802, Severity: 16, State: 4.
    2016-06-16 09:47:33.82 spid9s      CREATE DATABASE failed. Some file names listed could not be created. Check related errors.
    2016-06-16 09:47:33.82 spid9s      Could not create tempdb. You may not have enough disk space available. Free additional disk space by deleting other files on the tempdb drive and then restart SQL Server. Check for additional errors in the event log that may indicate why the tempdb files could not be initialized.
    2016-06-16 09:47:33.82 spid9s      SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.
  • Сергей Суровцев © (16.06.16 11:52) [6]
    Но ведь tempdb должен переформировываться при каждом старте. А он явно не желает этого делать.
  • sniknik © (16.06.16 12:33) [7]
    > 2016-06-16 09:47:33.81 spid9s      F:\MSSQL\DATA\tempdb.mdf: Operating system error 1393(Структура диска повреждена. Чтение невозможно.) encountered.
    проверка диска, отключение касперского
    https://forum.kaspersky.com/lofiversion/index.php/t241234.html
    ???
  • Сергей Суровцев © (16.06.16 14:49) [8]
    Проблема решилась, но частично.
    Был переустановлен драйвер диска, после чего Сервер запустился.
    Но при работе ругается так:

    The operating system returned error incorrect pageid (expected 1:166210; actual 1:166208) to SQL Server during a read at offset 0x00000051284000 in file 'F:\MSSQL\DATA\tempdb.mdf'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

    А это кусок лога после ошибки:

    2016-06-16 14:24:22.08 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:25:22.13 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:25:22.13 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:26:08.54 spid63      SQL Server encountered: 'incorrect pageid (expected 1:166210; actual 1:166208)' resulting from an attempt to read the following: sort run page (1:166210), in file 'F:\MSSQL\DATA\tempdb.mdf', in database with ID 2. Sort is retrying the read.
    2016-06-16 14:26:08.54 spid63      Error: 823, Severity: 24, State: 7.
    2016-06-16 14:26:08.54 spid63      The operating system returned error incorrect pageid (expected 1:166210; actual 1:166208) to SQL Server during a read at offset 0x00000051284000 in file 'F:\MSSQL\DATA\tempdb.mdf'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
    2016-06-16 14:26:25.25 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:26:25.25 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:27:25.32 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:27:25.32 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:28:25.44 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:28:25.44 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:29:25.54 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:29:25.54 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:30:25.61 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:30:25.61 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:31:25.62 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:31:25.62 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:32:25.68 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:32:25.68 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:33:25.77 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:33:25.77 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:33:43.84 spid7s      A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 770988, committed (KB): 10239996, memory utilization: 7%.
    2016-06-16 14:34:25.87 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:34:25.87 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:35:25.96 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:35:25.96 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:36:26.08 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:36:26.08 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:37:05.56 spid19s     AppDomain 2 (master.sys[runtime].1) is marked for unload due to memory pressure.
    2016-06-16 14:37:05.56 spid19s     AppDomain 2 (master.sys[runtime].1) unloaded.
    2016-06-16 14:37:26.15 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:37:26.15 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:38:26.24 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:38:26.24 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]
    2016-06-16 14:39:15.67 spid7s      A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 331 seconds. Working set (KB): 843228, committed (KB): 4626756, memory utilization: 18%.
    2016-06-16 14:39:26.29 Logon       Error: 18456, Severity: 14, State: 38.
    2016-06-16 14:39:26.29 Logon       Login failed for user 'STAND\perepelov_as'. Reason: Failed to open the explicitly specified database 'ReportServer'. [CLIENT: <local machine>]

    Причем именно так он вел себя С момента сбоя и ДО полного падения.
  • stas © (16.06.16 16:28) [9]
    А если остановить сервер и удалить tempdb ?
  • Сергей Суровцев © (16.06.16 16:52) [10]
    >stas ©   (16.06.16 16:28) [9]
    >А если остановить сервер и удалить tempdb ?

    Был бы комп для экспериментов, можно попробовать. А так сервак боевой, и завалить его со словами "а что если..." будет невежливо. Желательно точно знать, что делать, а чего не делать.
  • stas © (16.06.16 17:28) [11]
    Так переименуйте tempdb просто. А если что не так, вернете обратно.
  • stas © (16.06.16 17:35) [12]
    При отсутствии этой базы сервер ее создаст заново. Ну или он ругнулся, но запустился без проблем?
  • sniknik © (16.06.16 20:19) [13]
    > Complete a full database consistency check (DBCC CHECKDB).
    ну а чо не выполнишь, если просят?
  • Сергей Суровцев © (17.06.16 00:57) [14]
    >sniknik ©   (16.06.16 20:19) [13]
    >ну а чо не выполнишь, если просят?

    tempdb нельзя перевести в монопольный режим. Как и бекап с нее сделать.

    Да, кстати, по опыту, какое железо оптимально и какое минимально приемлемо для того, чтобы активно пинать таблицы по 40-60Гб? Ибо таких много. Так чтобы их нормально сливать, объединять по нескольку штук за раз и сохранять в тех же примерно объемах. Так чтобы комфортно, за приемлемое время.
  • Сергей Суровцев © (17.06.16 13:00) [15]
    >stas ©   (16.06.16 17:35) [12]
    >Так переименуйте tempdb просто. А если что не так, вернете обратно.

    Проверил ради интереса. Результат не очень обрадовал. База TempDB пересоздалась, но явно косячно. На попытку просмотра ее свойств или таблиц ругалась. Пришлось вернуть все обратно.
  • sniknik © (17.06.16 19:43) [16]
    > tempdb нельзя перевести в монопольный режим. Как и бекап с нее сделать.
    при чем тут монопольный режим, или бэкап? тебе в ошибке написали "выполни утилиту проверки". у тебя, судя по ошибке что-то сбойнуло с авторизацией... типа владелец таблиц тот кто уже уволился, исключен из домена, но по связям в базе должен быть... ну типа того. вот оно и пытается авторизовать "неизвестного" (кстати если даже создать такого же, "тезку", может не прийти там связка по сиду, а они уникальны)
    в общем выполни и посмотри, что она тебе скажет.
  • Сергей Суровцев © (20.06.16 13:56) [17]
    >sniknik ©   (17.06.16 19:43) [16]

    Ну так что там с конфигурацией железа?
 
Конференция "Базы" » Падение SQL Server и его восстановление
Есть новые Нет новых   [134427   +34][b:0][p:0.001]