lighttpd的SQLite问题

最近在使用 lighttpd 的 WebDAV 插件拷贝文件时遇到比较奇怪的 SQLite 库报错问题,WebDAV 插件需要使用 SQLite 数据库保存文件锁、文件属性等信息,报错信息如下:

(mod_webdav.c.2182) sql-set failed: SQL logic error or missing database
(mod_webdav.c.2182) sql-set failed: not an error
(mod_webdav.c.2182) sql-set failed: not an error
(mod_webdav.c.2182) sql-set failed: not an error
(mod_webdav.c.2182) sql-set failed: not an error
(mod_webdav.c.2182) sql-set failed: not an error
(mod_webdav.c.2182) sql-set failed: not an error
(mod_webdav.c.2182) sql-set failed: not an error
(mod_webdav.c.2511) remove lock: bind or column index out of range
(mod_webdav.c.2511) remove lock: bind or column index out of range
(mod_webdav.c.2511) remove lock: bind or column index out of range

我们先看看 mod_webdav.c 中对应的 SQLite 处理代码:

sqlite3_stmt *stmt;

stmt = (0 == xmlStrcmp(cmd->name, BAD_CAST "remove")) ?
	p->conf.stmt_delete_prop : p->conf.stmt_update_prop;

for (props = cmd->children; props; props = props->next) {
	if (0 == xmlStrcmp(props->name, BAD_CAST "prop")) {
		xmlNode *prop;
		int r;

		prop = props->children;

		sqlite3_reset(stmt);

		/* bind the values to the insert */

		sqlite3_bind_text(stmt, 1,
				  con->uri.path->ptr,
				  con->uri.path->used - 1,
				  SQLITE_TRANSIENT);
		sqlite3_bind_text(stmt, 2,
				  (char *)prop->name,
				  strlen((char *)prop->name),
				  SQLITE_TRANSIENT);
		if (prop->ns) {
			sqlite3_bind_text(stmt, 3,
					  (char *)prop->ns->href,
					  strlen((char *)prop->ns->href),
					  SQLITE_TRANSIENT);
		} else {
			sqlite3_bind_text(stmt, 3,
					  "",
					  0,
					  SQLITE_TRANSIENT);
		}
		if (stmt == p->conf.stmt_update_prop) {
			sqlite3_bind_text(stmt, 4,
				  (char *)xmlNodeGetContent(prop),
				  strlen((char *)xmlNodeGetContent(prop)),
				  SQLITE_TRANSIENT);
		}

		if (SQLITE_DONE != (r = sqlite3_step(stmt))) {
			log_error_write(srv, __FILE__, __LINE__, "ss",
					"sql-set failed:", sqlite3_errmsg(p->conf.sql));
		}
	}
}

这里用到了 SQLite 的绑定数据 sqlite3_bind_text 和 sqlite3_step 执行函数。

同时我们可以看看 mod_webdav.c 中打开 SQLite 数据库以及 SQLite 变量绑定方式的 SQL 处理代码:

const char *next_stmt;
char *err;

if (SQLITE_OK != sqlite3_open(s->sqlite_db_name->ptr, &(s->sql))) {
	log_error_write(srv, __FILE__, __LINE__, "sbs", "sqlite3_open failed for",
			s->sqlite_db_name,
			sqlite3_errmsg(s->sql));
	return HANDLER_ERROR;
}

if (SQLITE_OK != sqlite3_exec(s->sql,
		"CREATE TABLE properties ("
		"  resource TEXT NOT NULL,"
		"  prop TEXT NOT NULL,"
		"  ns TEXT NOT NULL,"
		"  value TEXT NOT NULL,"
		"  PRIMARY KEY(resource, prop, ns))",
		NULL, NULL, &err)) {

	if (0 != strcmp(err, "table properties already exists")) {
		log_error_write(srv, __FILE__, __LINE__, "ss", "can't open transaction:", err);
		sqlite3_free(err);

		return HANDLER_ERROR;
	}
	sqlite3_free(err);
}

...

if (SQLITE_OK != sqlite3_prepare(s->sql,
	CONST_STR_LEN("REPLACE INTO properties (resource, prop, ns, value) VALUES (?, ?, ?, ?)"),
	&(s->stmt_update_prop), &next_stmt)) {
	/* prepare failed */

	log_error_write(srv, __FILE__, __LINE__, "ss", "sqlite3_prepare failed:", sqlite3_errmsg(s->sql));
	return HANDLER_ERROR;
}

...

/* LOCKS */

if (SQLITE_OK != sqlite3_exec(s->sql,
		"CREATE TABLE locks ("
		"  locktoken TEXT NOT NULL,"
		"  resource TEXT NOT NULL,"
		"  lockscope TEXT NOT NULL,"
		"  locktype TEXT NOT NULL,"
		"  owner TEXT NOT NULL,"
		"  depth INT NOT NULL,"
		"  timeout TIMESTAMP NOT NULL,"
		"  PRIMARY KEY(locktoken))",
		NULL, NULL, &err)) {

	if (0 != strcmp(err, "table locks already exists")) {
		log_error_write(srv, __FILE__, __LINE__, "ss", "can't open transaction:", err);
		sqlite3_free(err);

		return HANDLER_ERROR;
	}
	sqlite3_free(err);
}

if (SQLITE_OK != sqlite3_prepare(s->sql,
	CONST_STR_LEN("INSERT INTO locks (locktoken, resource, lockscope, locktype, owner, depth, timeout) VALUES (?,?,?,?,?,?, CURRENT_TIME + 600)"),
	&(s->stmt_create_lock), &next_stmt)) {
	/* prepare failed */
	log_error_write(srv, __FILE__, __LINE__, "ss", "sqlite3_prepare failed", sqlite3_errmsg(s->sql));

	return HANDLER_ERROR;
}

...

从上面的代码可以看出,lighttpd 打开 SQLite 数据库之后先创建 properties 表格,然后使用 sqlite3_prepare 函数进行 properties 表格的 SQL 语句预处理,接着创建 locks 表格,同时也使用 sqlite3_prepare 对 locks 表格进行预处理。

使用 sqlite3_prepare 函数而不直接使用 sqlite3_exec 函数的好处是可以直接将 SQL 语句转换为 SQLite 内部的字节码,后面使用时不需要再直接使用冗长的 SQL 语句,直接绑定数据就可以查询或者更新数据了。这里就比较奇怪了,stmt_update_prop 的 SQL 语句看起来也没有什么问题。首先把疑点放在 sqlite3_bind_text 上,怀疑是不是数据中有没有什么特殊字符导致 sqlite3_step 函数执行出错了,但我把 sqlite3_bind_text 的内容全部改成普通字符串还是一样的报错。

搜寻一番之后终于在 sqlite3.h 头文件中发现了端倪:

** The sqlite3_prepare_v2() and sqlite3_prepare16_v2() interfaces are
** recommended for all new programs. The two older interfaces are retained
** for backwards compatibility, but their use is discouraged.、
**
** In the "v2" interfaces, the prepared statement
** that is returned (the [sqlite3_stmt] object) contains a copy of the
** original SQL text. This causes the [sqlite3_step()] interface to
** behave differently in three ways:
**
** If the database schema changes, instead of returning [SQLITE_SCHEMA] as it
** always used to do, [sqlite3_step()] will automatically recompile the SQL
** statement and try to run it again. As many as [SQLITE_MAX_SCHEMA_RETRY]
** retries will occur before sqlite3_step() gives up and returns an error.
**
** When an error occurs, [sqlite3_step()] will return one of the detailed
** [error codes] or [extended error codes]. The legacy behavior was that
** [sqlite3_step()] would only return a generic [SQLITE_ERROR] result code
** and the application would have to make a second call to [sqlite3_reset()]
** in order to find the underlying cause of the problem. With the "v2" prepare
** interfaces, the underlying reason for the error is returned immediately.

sqlite3.h 中建议使用新的 sqlite3_prepare_v2 函数替代老的 sqlite3_prepare 函数,而且中间的注释中还提到使用 sqlite3_prepare 函数之后如果数据库的结构发生了变动,sqlite3_step 函数将会出错。这时我们看看 mod_webdav.c 中初始化数据库的代码可以发现 sqlite3_prepare 函数是在两个 sqlite3_exec 函数创建数据库之间运行的,这样 sqlite3_prepare 生成的字节码就不是正确的,才导致后续的 SQL 更新操作失败。

知道原因之后修改也比较简单了,我们可以修改 mod_webdav.c 中的数据库初始化代码,将两个 sqlite3_exec 创建数据库的操作放在最前面,将所有 sqlite3_prepare 替换为 sqlite3_prepare_v2 函数放在创建数据库之后,这样就不会出现 SQLite 数据库错误问题了。

本文为个人使用分析的结果,其中有任何问题欢迎提出指正,另外 lighttpd 最新版本库代码中这个问题似乎仍然没有修正。